Computer vision can be used in health care for identifying diseases. In Pneumonia detection we need to detect Inflammation of the lungs. In this project, you’re required to build an algorithm to detect a visual signal for pneumonia in medical images. Specifically, your algorithm needs to automatically locate lung opacities on chest radiographs
In the dataset, some of the features are labeled “Not Normal No Lung Opacity”. This extra third class indicates that while pneumonia was determined not to be present, there was nonetheless some type of abnormality on the image and oftentimes this finding may mimic the appearance of true pneumonia. Dicom original images: - Medical images are stored in a special format called DICOM files (*.dcm). They contain a combination of header metadata as well as underlying raw image arrays for pixel data
</div>!pip install pydicom
import pydicom as dcm
import cv2
import os
import pandas as pd
import numpy as np
import glob
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.layers import BatchNormalization, Activation
from tensorflow.keras.layers import Dropout
from tensorflow.keras import layers
from tensorflow.keras import regularizers
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.optimizers import Adam
from tqdm.notebook import tqdm
from google.colab import drive
drive.mount('/content/drive')
Requirement already satisfied: pydicom in /usr/local/lib/python3.10/dist-packages (2.4.3)
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
import zipfile
import os
zip_ref = zipfile.ZipFile('/content/drive/MyDrive/rsna-pneumonia-detection-challenge.zip', 'r') #Opens the zip file in read mode
zip_ref.extractall('/tmp') #Extracts the files into the /tmp folder
zip_ref.close()
len(os.listdir('/tmp/stage_2_train_images'))
26684
# importing labels info
labels_df = pd.read_csv("/content/drive/MyDrive/Capstone project - Pneumonia detection project/stage_2_train_labels.csv")
labels_df.head()
| patientId | x | y | width | height | Target | |
|---|---|---|---|---|---|---|
| 0 | 0004cfab-14fd-4e49-80ba-63a80b6bddd6 | NaN | NaN | NaN | NaN | 0 |
| 1 | 00313ee0-9eaa-42f4-b0ab-c148ed3241cd | NaN | NaN | NaN | NaN | 0 |
| 2 | 00322d4d-1c29-4943-afc9-b6754be640eb | NaN | NaN | NaN | NaN | 0 |
| 3 | 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 | NaN | NaN | NaN | NaN | 0 |
| 4 | 00436515-870c-4b36-a041-de91049b9ab4 | 264.0 | 152.0 | 213.0 | 379.0 | 1 |
labels_df.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 30227 entries, 0 to 30226 Data columns (total 6 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 patientId 30227 non-null object 1 x 9555 non-null float64 2 y 9555 non-null float64 3 width 9555 non-null float64 4 height 9555 non-null float64 5 Target 30227 non-null int64 dtypes: float64(4), int64(1), object(1) memory usage: 1.4+ MB
labels_df.isnull().sum()
patientId 0 x 20672 y 20672 width 20672 height 20672 Target 0 dtype: int64
labels_df.shape
(30227, 6)
# importing detailed class info
detailed_info_df = pd.read_csv("/content/drive/MyDrive/Capstone project - Pneumonia detection project/stage_2_detailed_class_info.csv")
detailed_info_df.head()
| patientId | class | |
|---|---|---|
| 0 | 0004cfab-14fd-4e49-80ba-63a80b6bddd6 | No Lung Opacity / Not Normal |
| 1 | 00313ee0-9eaa-42f4-b0ab-c148ed3241cd | No Lung Opacity / Not Normal |
| 2 | 00322d4d-1c29-4943-afc9-b6754be640eb | No Lung Opacity / Not Normal |
| 3 | 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 | Normal |
| 4 | 00436515-870c-4b36-a041-de91049b9ab4 | Lung Opacity |
detailed_info_df.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 30227 entries, 0 to 30226 Data columns (total 2 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 patientId 30227 non-null object 1 class 30227 non-null object dtypes: object(2) memory usage: 472.4+ KB
detailed_info_df.isnull().sum()
patientId 0 class 0 dtype: int64
detailed_info_df.shape
(30227, 2)
# defining image path and counting the number of images we have in train image set
image_path = "/tmp/stage_2_train_images"
number_of_train_images = os.listdir(image_path)
print("Total images in train dataset: ", len(number_of_train_images))
Total images in train dataset: 26684
# defining test image path and counting the number of images we have in test image set
test_image_path = "/tmp/stage_2_test_images"
number_of_test_images = os.listdir(test_image_path)
print("Total test images in train dataset: ", len(number_of_test_images))
Total test images in train dataset: 3000
# sum up the train and test images
total_images = len(number_of_test_images) + len(number_of_train_images)
print("Total number of images(train+test): ", total_images)
Total number of images(train+test): 29684
Assuming all the labels and annotations are in the same CSV files for train set and test set. Because the sum of total images from train and test are very close to the total of detailed class info and labels CSV.
# importing submission sample
submission_sample_df = pd.read_csv("/content/drive/MyDrive/Capstone project - Pneumonia detection project/stage_2_sample_submission.csv")
submission_sample_df.head()
| patientId | PredictionString | |
|---|---|---|
| 0 | 0000a175-0e68-4ca4-b1af-167204a7e0bc | 0.5 0 0 100 100 |
| 1 | 0005d3cc-3c3f-40b9-93c3-46231c3eb813 | 0.5 0 0 100 100 |
| 2 | 000686d7-f4fc-448d-97a0-44fa9c5d3aa6 | 0.5 0 0 100 100 |
| 3 | 000e3a7d-c0ca-4349-bb26-5af2d8993c3d | 0.5 0 0 100 100 |
| 4 | 00100a24-854d-423d-a092-edcf6179e061 | 0.5 0 0 100 100 |
# add another column image path to pnemonia dataframe to combine image paths with the classes
labels_df['image_path'] = ''
for i in range(len(labels_df)):
patient_id = labels_df['patientId'][i]
file_path = os.path.join(image_path, patient_id + '.dcm')
labels_df['image_path'][i] = file_path
<ipython-input-15-22e00841d9c9>:7: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy labels_df['image_path'][i] = file_path
empty_imagepath_count = labels_df['image_path'].str.count('^$').sum()
print(f'Empty image count: {empty_imagepath_count}')
Empty image count: 0
labels_df.shape
(30227, 7)
labels_df.head()
| patientId | x | y | width | height | Target | image_path | |
|---|---|---|---|---|---|---|---|
| 0 | 0004cfab-14fd-4e49-80ba-63a80b6bddd6 | NaN | NaN | NaN | NaN | 0 | /tmp/stage_2_train_images/0004cfab-14fd-4e49-8... |
| 1 | 00313ee0-9eaa-42f4-b0ab-c148ed3241cd | NaN | NaN | NaN | NaN | 0 | /tmp/stage_2_train_images/00313ee0-9eaa-42f4-b... |
| 2 | 00322d4d-1c29-4943-afc9-b6754be640eb | NaN | NaN | NaN | NaN | 0 | /tmp/stage_2_train_images/00322d4d-1c29-4943-a... |
| 3 | 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 | NaN | NaN | NaN | NaN | 0 | /tmp/stage_2_train_images/003d8fa0-6bf1-40ed-b... |
| 4 | 00436515-870c-4b36-a041-de91049b9ab4 | 264.0 | 152.0 | 213.0 | 379.0 | 1 | /tmp/stage_2_train_images/00436515-870c-4b36-a... |
# reading dicom data from the pneumonia dataframe
dicom_data = dcm.read_file(labels_df['image_path'][0])
print(dicom_data)
Dataset.file_meta ------------------------------- (0002, 0000) File Meta Information Group Length UL: 202 (0002, 0001) File Meta Information Version OB: b'\x00\x01' (0002, 0002) Media Storage SOP Class UID UI: Secondary Capture Image Storage (0002, 0003) Media Storage SOP Instance UID UI: 1.2.276.0.7230010.3.1.4.8323329.28530.1517874485.775526 (0002, 0010) Transfer Syntax UID UI: JPEG Baseline (Process 1) (0002, 0012) Implementation Class UID UI: 1.2.276.0.7230010.3.0.3.6.0 (0002, 0013) Implementation Version Name SH: 'OFFIS_DCMTK_360' ------------------------------------------------- (0008, 0005) Specific Character Set CS: 'ISO_IR 100' (0008, 0016) SOP Class UID UI: Secondary Capture Image Storage (0008, 0018) SOP Instance UID UI: 1.2.276.0.7230010.3.1.4.8323329.28530.1517874485.775526 (0008, 0020) Study Date DA: '19010101' (0008, 0030) Study Time TM: '000000.00' (0008, 0050) Accession Number SH: '' (0008, 0060) Modality CS: 'CR' (0008, 0064) Conversion Type CS: 'WSD' (0008, 0090) Referring Physician's Name PN: '' (0008, 103e) Series Description LO: 'view: PA' (0010, 0010) Patient's Name PN: '0004cfab-14fd-4e49-80ba-63a80b6bddd6' (0010, 0020) Patient ID LO: '0004cfab-14fd-4e49-80ba-63a80b6bddd6' (0010, 0030) Patient's Birth Date DA: '' (0010, 0040) Patient's Sex CS: 'F' (0010, 1010) Patient's Age AS: '51' (0018, 0015) Body Part Examined CS: 'CHEST' (0018, 5101) View Position CS: 'PA' (0020, 000d) Study Instance UID UI: 1.2.276.0.7230010.3.1.2.8323329.28530.1517874485.775525 (0020, 000e) Series Instance UID UI: 1.2.276.0.7230010.3.1.3.8323329.28530.1517874485.775524 (0020, 0010) Study ID SH: '' (0020, 0011) Series Number IS: '1' (0020, 0013) Instance Number IS: '1' (0020, 0020) Patient Orientation CS: '' (0028, 0002) Samples per Pixel US: 1 (0028, 0004) Photometric Interpretation CS: 'MONOCHROME2' (0028, 0010) Rows US: 1024 (0028, 0011) Columns US: 1024 (0028, 0030) Pixel Spacing DS: [0.14300000000000002, 0.14300000000000002] (0028, 0100) Bits Allocated US: 8 (0028, 0101) Bits Stored US: 8 (0028, 0102) High Bit US: 7 (0028, 0103) Pixel Representation US: 0 (0028, 2110) Lossy Image Compression CS: '01' (0028, 2114) Lossy Image Compression Method CS: 'ISO_10918_1' (7fe0, 0010) Pixel Data OB: Array of 142006 elements
# mapping training and testing images with the annotations by combining labels_df and detailed_info_df
pneumonia_df = pd.merge(detailed_info_df, labels_df, on='patientId', how='inner')
pneumonia_df.shape
(37629, 8)
Looks like sum duplicate records come while merging, let's use unique records
pneumonia_df.head()
| patientId | class | x | y | width | height | Target | image_path | |
|---|---|---|---|---|---|---|---|---|
| 0 | 0004cfab-14fd-4e49-80ba-63a80b6bddd6 | No Lung Opacity / Not Normal | NaN | NaN | NaN | NaN | 0 | /tmp/stage_2_train_images/0004cfab-14fd-4e49-8... |
| 1 | 00313ee0-9eaa-42f4-b0ab-c148ed3241cd | No Lung Opacity / Not Normal | NaN | NaN | NaN | NaN | 0 | /tmp/stage_2_train_images/00313ee0-9eaa-42f4-b... |
| 2 | 00322d4d-1c29-4943-afc9-b6754be640eb | No Lung Opacity / Not Normal | NaN | NaN | NaN | NaN | 0 | /tmp/stage_2_train_images/00322d4d-1c29-4943-a... |
| 3 | 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 | Normal | NaN | NaN | NaN | NaN | 0 | /tmp/stage_2_train_images/003d8fa0-6bf1-40ed-b... |
| 4 | 00436515-870c-4b36-a041-de91049b9ab4 | Lung Opacity | 264.0 | 152.0 | 213.0 | 379.0 | 1 | /tmp/stage_2_train_images/00436515-870c-4b36-a... |
pneumonia_df = pd.concat([detailed_info_df.drop(columns = 'patientId'), labels_df], axis = 1)
pneumonia_df.shape
(30227, 8)
pneumonia_df['patientId'].nunique()
26684
sns.histplot(data = pneumonia_df['Target'])
<Axes: xlabel='Target', ylabel='Count'>
pneumonia_df['Target'].value_counts()
0 20672 1 9555 Name: Target, dtype: int64
We have an imbalance in the target data.
Class 0 - Normal, No Lung Opacity/Not Normal - 20672
Class 1 - Lung Opacity - 9555
sns.displot(data=pneumonia_df, x="class")
<seaborn.axisgrid.FacetGrid at 0x7f5e73e77520>
pneumonia_df['class'].value_counts()
No Lung Opacity / Not Normal 11821 Lung Opacity 9555 Normal 8851 Name: class, dtype: int64
The above clearly shows that the dataset is imbalanced. We can use data augmentation techniques to make a balanced dataset.
# converting the image to a pixel values and resizing the array values.
resized_images = []
boxes = []
for i in tqdm(range(len(pneumonia_df))):
patient_id = pneumonia_df['patientId'][i]
image_path = pneumonia_df['image_path'][i]
target = pneumonia_df['Target'][i]
dicom_data = dcm.read_file(image_path)
img = dicom_data.pixel_array
# Resize image to 224x224
img = cv2.resize(img, (224, 224))
img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
resized_images.append(img)
boxes.append(np.array(target, dtype=np.float32))
0%| | 0/30227 [00:00<?, ?it/s]
len(resized_images)
30227
resized_images[0].shape
(224, 224, 3)
plt.figure(figsize=(25,25)) #
for i, image in enumerate(resized_images[:9]):
plt.subplot(3,3,i+1)
plt.imshow(image)
if pneumonia_df.loc[i]["Target"]:
plt.title("Pneumonia", color="red", fontsize=25)
else:
plt.title("No Pneumonia", color="blue", fontsize=25)
plt.axis('off')
plt.show()
resized_images_1 = resized_images.copy()
len(resized_images_1)
30227
plt.figure(figsize=(25,25)) #
for i, image in enumerate(resized_images_1[:9]):
x1 = pneumonia_df['x'][i]
y1 = pneumonia_df['y'][i]
x2 = pneumonia_df['width'][i]
y2 = pneumonia_df['height'][i]
plt.subplot(3,3,i+1)
if np.isnan(x1):
#print(1)
plt.imshow(image)
else:
#print(2)
pt1=int(x1*(224.0/1024.0)),int(y1*(224.0/1024.0))
pt2=int(x2*(224.0/1024.0)),int(y2*(224.0/1024.0))
image = cv2.rectangle(image,pt1,pt2, color=(255,0,0), thickness = 1)
plt.imshow(image)
if pneumonia_df.loc[i]["Target"]:
plt.title("Pneumonia", color="red", fontsize=25)
else:
plt.title("No Pneumonia", color="blue", fontsize=25)
plt.axis('off')
plt.show()
X_train = resized_images[:20000]
X_test = resized_images[20000:30000]
y_train = np.array(boxes[:20000])
y_test = np.array(boxes[20000:30000])
print(len(X_train), len(y_train))
print(len(X_test), len(y_test))
20000 20000 10000 10000
unique_values = set(np.array(y_train))
len(unique_values)
2
cnt = 0
cnt1 =0
for i in range(len(y_train)):
if y_train[i] == 0:
cnt = cnt+1
else:
cnt1 =cnt1+1
print(cnt, cnt1)
13346 6654
Base Model : Resnet
Data : ImBalanced dataset
X_train = np.array(X_train,dtype=np.float32)
X_train = X_train / 255.0
X_test = np.array(X_test,dtype=np.float32)
X_test = X_test / 255.0
type(X_train)
numpy.ndarray
y_train[0]
0.0
X_train.shape
(20000, 224, 224, 3)
y_train[1]
0.0
# Load the pre-trained ResNet50 model with weights from ImageNet
base_model_resnet = ResNet50(weights='imagenet', input_shape=(224, 224, 3), include_top=False, pooling='avg')
# freeze some of the top layers
for layer in base_model_resnet.layers[:-10]:
layer.trainable = False
# Create your custom classifier on top of the pre-trained model
model_resnet = Sequential()
model_resnet.add(base_model_resnet)
model_resnet.add(BatchNormalization())
# Add more layers as needed
model_resnet.add(Dense(256, kernel_regularizer=regularizers.l2(0.01)))
model_resnet.add(BatchNormalization())
model_resnet.add(Activation(activation='relu'))
model_resnet.add(Dropout(0.3))
# Add the final output layer
model_resnet.add(Dense(1, activation='sigmoid')) # Assuming binary classification
model_resnet.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
resnet50 (Functional) (None, 2048) 23587712
batch_normalization (Batch (None, 2048) 8192
Normalization)
dense (Dense) (None, 256) 524544
batch_normalization_1 (Bat (None, 256) 1024
chNormalization)
activation (Activation) (None, 256) 0
dropout (Dropout) (None, 256) 0
dense_1 (Dense) (None, 1) 257
=================================================================
Total params: 24121729 (92.02 MB)
Trainable params: 3945985 (15.05 MB)
Non-trainable params: 20175744 (76.96 MB)
_________________________________________________________________
# Compile the model with a lower learning rate for fine-tuning
optimizer = Adam(learning_rate=0.0001) # Lower learning rate
model_resnet.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
# Fine-tuning the model on the data
history_resnet = model_resnet.fit(X_train, y_train, epochs=20, validation_split=0.2)
Epoch 1/20 500/500 [==============================] - 34s 35ms/step - loss: 3.6497 - accuracy: 0.7268 - val_loss: 2.5634 - val_accuracy: 0.7157 Epoch 2/20 500/500 [==============================] - 13s 26ms/step - loss: 1.9009 - accuracy: 0.7639 - val_loss: 1.5265 - val_accuracy: 0.7322 Epoch 3/20 500/500 [==============================] - 13s 26ms/step - loss: 1.1838 - accuracy: 0.7707 - val_loss: 1.5116 - val_accuracy: 0.5805 Epoch 4/20 500/500 [==============================] - 13s 26ms/step - loss: 0.8459 - accuracy: 0.7791 - val_loss: 0.9462 - val_accuracy: 0.7178 Epoch 5/20 500/500 [==============================] - 13s 26ms/step - loss: 0.6754 - accuracy: 0.7832 - val_loss: 0.8094 - val_accuracy: 0.6837 Epoch 6/20 500/500 [==============================] - 13s 26ms/step - loss: 0.5865 - accuracy: 0.7875 - val_loss: 1.2672 - val_accuracy: 0.5717 Epoch 7/20 500/500 [==============================] - 13s 26ms/step - loss: 0.5383 - accuracy: 0.7952 - val_loss: 0.6162 - val_accuracy: 0.7628 Epoch 8/20 500/500 [==============================] - 13s 26ms/step - loss: 0.5014 - accuracy: 0.8001 - val_loss: 0.5907 - val_accuracy: 0.7533 Epoch 9/20 500/500 [==============================] - 13s 26ms/step - loss: 0.4809 - accuracy: 0.8043 - val_loss: 0.5905 - val_accuracy: 0.7455 Epoch 10/20 500/500 [==============================] - 13s 26ms/step - loss: 0.4641 - accuracy: 0.8076 - val_loss: 1.4452 - val_accuracy: 0.5663 Epoch 11/20 500/500 [==============================] - 13s 26ms/step - loss: 0.4485 - accuracy: 0.8134 - val_loss: 0.8427 - val_accuracy: 0.6385 Epoch 12/20 500/500 [==============================] - 13s 26ms/step - loss: 0.4438 - accuracy: 0.8140 - val_loss: 1.6071 - val_accuracy: 0.4380 Epoch 13/20 500/500 [==============================] - 13s 26ms/step - loss: 0.4337 - accuracy: 0.8164 - val_loss: 1.2573 - val_accuracy: 0.5817 Epoch 14/20 500/500 [==============================] - 13s 26ms/step - loss: 0.4251 - accuracy: 0.8211 - val_loss: 2.2179 - val_accuracy: 0.5652 Epoch 15/20 500/500 [==============================] - 13s 26ms/step - loss: 0.4135 - accuracy: 0.8272 - val_loss: 1.4912 - val_accuracy: 0.5175 Epoch 16/20 500/500 [==============================] - 13s 26ms/step - loss: 0.4013 - accuracy: 0.8314 - val_loss: 1.9489 - val_accuracy: 0.5727 Epoch 17/20 500/500 [==============================] - 13s 26ms/step - loss: 0.3987 - accuracy: 0.8349 - val_loss: 3.4315 - val_accuracy: 0.5663 Epoch 18/20 500/500 [==============================] - 13s 26ms/step - loss: 0.3902 - accuracy: 0.8401 - val_loss: 0.9160 - val_accuracy: 0.5178 Epoch 19/20 500/500 [==============================] - 13s 26ms/step - loss: 0.3835 - accuracy: 0.8422 - val_loss: 0.7251 - val_accuracy: 0.6957 Epoch 20/20 500/500 [==============================] - 13s 26ms/step - loss: 0.3763 - accuracy: 0.8456 - val_loss: 1.1890 - val_accuracy: 0.6633
# Create a single figure with two subplots, arranged horizontally
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Plot training loss and validation loss in the first subplot
ax1.plot(history_resnet.history['val_accuracy'], label='Validation Accuracy')
ax1.plot(history_resnet.history['accuracy'], label='Training Accuracy')
ax1.set_title('Training Loss & Accuracy')
ax1.legend()
# Plot training accuracy and validation accuracy in the second subplot
ax2.plot(history_resnet.history['val_loss'], label='Validation Loss')
ax2.plot(history_resnet.history['loss'], label='Training loss')
ax2.set_title('Validation Loss & Accuracy')
ax2.legend()
plt.tight_layout()
plt.show()
Balanced dataset using data augmentation techniques
# good for balancing out disproportions in the dataset
from tensorflow.keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator(
featurewise_center=False,
samplewise_center=False,
featurewise_std_normalization=False,
samplewise_std_normalization=False,
zca_whitening=False,
rotation_range=90,
zoom_range = 0.1,
width_shift_range=0.1,
height_shift_range=0.1,
horizontal_flip=True,
vertical_flip=True)
datagen.fit(X_train)
history_resnet_1 = model_resnet.fit(datagen.flow(X_train[:6000], y_train[:6000], batch_size=10), epochs=10, validation_data=(X_test, y_test))
Epoch 1/10 600/600 [==============================] - 79s 127ms/step - loss: 0.6725 - accuracy: 0.6848 - val_loss: 0.7159 - val_accuracy: 0.7278 Epoch 2/10 600/600 [==============================] - 66s 110ms/step - loss: 0.6271 - accuracy: 0.7027 - val_loss: 0.6184 - val_accuracy: 0.7310 Epoch 3/10 600/600 [==============================] - 66s 110ms/step - loss: 0.6120 - accuracy: 0.6993 - val_loss: 0.5765 - val_accuracy: 0.7211 Epoch 4/10 600/600 [==============================] - 66s 109ms/step - loss: 0.6046 - accuracy: 0.7117 - val_loss: 0.6513 - val_accuracy: 0.7301 Epoch 5/10 600/600 [==============================] - 66s 110ms/step - loss: 0.5987 - accuracy: 0.7160 - val_loss: 0.6883 - val_accuracy: 0.6222 Epoch 6/10 600/600 [==============================] - 66s 110ms/step - loss: 0.5985 - accuracy: 0.7097 - val_loss: 1.4719 - val_accuracy: 0.3914 Epoch 7/10 600/600 [==============================] - 66s 109ms/step - loss: 0.5960 - accuracy: 0.7120 - val_loss: 0.7094 - val_accuracy: 0.6464 Epoch 8/10 600/600 [==============================] - 66s 110ms/step - loss: 0.5932 - accuracy: 0.7047 - val_loss: 0.6018 - val_accuracy: 0.6907 Epoch 9/10 600/600 [==============================] - 66s 110ms/step - loss: 0.5936 - accuracy: 0.7098 - val_loss: 0.5651 - val_accuracy: 0.7229 Epoch 10/10 600/600 [==============================] - 66s 110ms/step - loss: 0.5865 - accuracy: 0.7113 - val_loss: 0.5153 - val_accuracy: 0.7619
# Create a single figure with two subplots, arranged horizontally
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Plot training loss and validation loss in the first subplot
ax1.plot(history_resnet_1.history['val_accuracy'], label='Validation Accuracy')
ax1.plot(history_resnet_1.history['accuracy'], label='Training Accuracy')
ax1.set_title('Training & validation Accuracy')
ax1.legend()
# Plot training accuracy and validation accuracy in the second subplot
ax2.plot(history_resnet_1.history['val_loss'], label='Validation Loss')
ax2.plot(history_resnet_1.history['loss'], label='Training loss')
ax2.set_title('Training & Validation Loss')
ax2.legend()
plt.tight_layout()
plt.show()
Model Accuracy is looks good on the sample data. We used only 20000 total records for this project with A100 GPU
Model Accuracy is ~84.5%
Model Validation Accuracy is ~66%
Validation accuracy is pretty low, we will further tune this model to get better accuracy.
With the augmented datasets, the model validation accuracy moves to 71%, but at the same time the training has come down to 76%. But it also says the model predicting capability has improved.
Intrim report is attached as a document.
# Using ResNet50 as a base model
tuned_base_model_resnet = ResNet50(weights = 'imagenet', input_shape = (224,224,3) , include_top=False, pooling='avg' )
for l in tuned_base_model_resnet.layers[:-4]:
l.trainable=False
tuned_model_resnet = Sequential()
tuned_model_resnet.add(base_model_resnet)
tuned_model_resnet.add(BatchNormalization())
tuned_model_resnet.add(Dense(256, kernel_regularizer=regularizers.l2(0.01)))
tuned_model_resnet.add(BatchNormalization())
tuned_model_resnet.add(Activation(activation='relu'))
tuned_model_resnet.add(Dropout(0.3))
tuned_model_resnet.add(Dense(128, kernel_regularizer=regularizers.l2(0.01)))
tuned_model_resnet.add(BatchNormalization())
tuned_model_resnet.add(Activation(activation='relu'))
tuned_model_resnet.add(Dropout(0.3))
tuned_model_resnet.add(Dense(64, kernel_regularizer=regularizers.l2(0.01)))
tuned_model_resnet.add(BatchNormalization())
tuned_model_resnet.add(Activation(activation='relu'))
tuned_model_resnet.add(Dropout(0.3))
tuned_model_resnet.add(Dense(32, kernel_regularizer=regularizers.l2(0.01)))
tuned_model_resnet.add(BatchNormalization())
tuned_model_resnet.add(Activation(activation='relu'))
tuned_model_resnet.add(Dropout(0.3))
tuned_model_resnet.add(Dense(16, kernel_regularizer=regularizers.l2(0.01)))
tuned_model_resnet.add(BatchNormalization())
tuned_model_resnet.add(Activation(activation='relu'))
tuned_model_resnet.add(Dropout(0.3))
tuned_model_resnet.add(Dense(1, activation = 'sigmoid'))
tuned_model_resnet.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
resnet50 (Functional) (None, 2048) 23587712
batch_normalization_2 (Bat (None, 2048) 8192
chNormalization)
dense_2 (Dense) (None, 256) 524544
batch_normalization_3 (Bat (None, 256) 1024
chNormalization)
activation_1 (Activation) (None, 256) 0
dropout_1 (Dropout) (None, 256) 0
dense_3 (Dense) (None, 128) 32896
batch_normalization_4 (Bat (None, 128) 512
chNormalization)
activation_2 (Activation) (None, 128) 0
dropout_2 (Dropout) (None, 128) 0
dense_4 (Dense) (None, 64) 8256
batch_normalization_5 (Bat (None, 64) 256
chNormalization)
activation_3 (Activation) (None, 64) 0
dropout_3 (Dropout) (None, 64) 0
dense_5 (Dense) (None, 32) 2080
batch_normalization_6 (Bat (None, 32) 128
chNormalization)
activation_4 (Activation) (None, 32) 0
dropout_4 (Dropout) (None, 32) 0
dense_6 (Dense) (None, 16) 528
batch_normalization_7 (Bat (None, 16) 64
chNormalization)
activation_5 (Activation) (None, 16) 0
dropout_5 (Dropout) (None, 16) 0
dense_7 (Dense) (None, 1) 17
=================================================================
Total params: 24166209 (92.19 MB)
Trainable params: 3989985 (15.22 MB)
Non-trainable params: 20176224 (76.97 MB)
_________________________________________________________________
#optimizer = Adam(learning_rate=0.00001)
tuned_model_resnet.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
tuned_history_resnet = tuned_model_resnet.fit(datagen.flow(X_train[:6000], y_train[:6000], batch_size=10), epochs = 10, validation_data=(X_test[:6000], y_test[:6000]))
Epoch 1/10 600/600 [==============================] - 77s 113ms/step - loss: 3.6876 - accuracy: 0.6208 - val_loss: 2.2882 - val_accuracy: 0.3802 Epoch 2/10 600/600 [==============================] - 64s 107ms/step - loss: 1.4149 - accuracy: 0.6742 - val_loss: 1.1451 - val_accuracy: 0.5830 Epoch 3/10 600/600 [==============================] - 64s 106ms/step - loss: 0.8913 - accuracy: 0.6835 - val_loss: 0.8468 - val_accuracy: 0.5840 Epoch 4/10 600/600 [==============================] - 64s 107ms/step - loss: 0.7278 - accuracy: 0.6900 - val_loss: 0.6091 - val_accuracy: 0.7202 Epoch 5/10 600/600 [==============================] - 64s 106ms/step - loss: 0.6779 - accuracy: 0.6995 - val_loss: 1.4787 - val_accuracy: 0.2930 Epoch 6/10 600/600 [==============================] - 64s 107ms/step - loss: 0.6666 - accuracy: 0.7027 - val_loss: 0.6477 - val_accuracy: 0.7085 Epoch 7/10 600/600 [==============================] - 64s 106ms/step - loss: 0.6643 - accuracy: 0.7058 - val_loss: 0.6136 - val_accuracy: 0.7390 Epoch 8/10 600/600 [==============================] - 64s 106ms/step - loss: 0.6627 - accuracy: 0.6985 - val_loss: 0.9830 - val_accuracy: 0.5232 Epoch 9/10 600/600 [==============================] - 64s 107ms/step - loss: 0.6640 - accuracy: 0.6978 - val_loss: 1.0457 - val_accuracy: 0.3323 Epoch 10/10 600/600 [==============================] - 64s 106ms/step - loss: 0.6732 - accuracy: 0.7033 - val_loss: 0.7323 - val_accuracy: 0.6203
# Create a single figure with two subplots, arranged horizontally
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Plot training loss and validation loss in the first subplot
ax1.plot(tuned_history_resnet.history['val_accuracy'], label='val_accuracy')
ax1.plot(tuned_history_resnet.history['accuracy'], label='Training Accuracy')
ax1.set_title('Training Loss & Accuracy')
ax1.legend()
# Plot training accuracy and validation accuracy in the second subplot
ax2.plot(tuned_history_resnet.history['val_loss'], label='Validation Loss')
ax2.plot(tuned_history_resnet.history['loss'], label='training loss')
ax2.set_title('Validation Loss & Accuracy')
ax2.legend()
plt.tight_layout()
plt.show()
We used data augmented dataset for this project and for tuning, additional layers were added and the model accuracy is below
Model Accuracy is ~70%
Model Validation Accuracy is ~62%
After tuning the parameters of ResNet50, we can see that the validation Accuracy has decreased.
We have already used transfer learning concept to train the model in previous teps, so we would like to take the oppertunity to use some other model to train our model with transfer learning concept.
# Trying EfficientNet
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense
from tensorflow.keras.models import Model
# Load the EfficientNetB0 model pre-trained on ImageNet data
tl_base_model = EfficientNetB0(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
for l in tl_base_model.layers[:-20]:
l.trainable=False
# Add custom classification layers on top of the base model
x = tl_base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(256, activation='relu')(x) # You can adjust the number of units as needed
predictions = Dense(1, activation='sigmoid')(x) # For binary classification
# Create the final model
tl_model = Model(inputs=tl_base_model.input, outputs=predictions)
tl_model.summary()
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_3 (InputLayer) [(None, 224, 224, 3)] 0 []
rescaling (Rescaling) (None, 224, 224, 3) 0 ['input_3[0][0]']
normalization (Normalizati (None, 224, 224, 3) 7 ['rescaling[0][0]']
on)
rescaling_1 (Rescaling) (None, 224, 224, 3) 0 ['normalization[0][0]']
stem_conv_pad (ZeroPadding (None, 225, 225, 3) 0 ['rescaling_1[0][0]']
2D)
stem_conv (Conv2D) (None, 112, 112, 32) 864 ['stem_conv_pad[0][0]']
stem_bn (BatchNormalizatio (None, 112, 112, 32) 128 ['stem_conv[0][0]']
n)
stem_activation (Activatio (None, 112, 112, 32) 0 ['stem_bn[0][0]']
n)
block1a_dwconv (DepthwiseC (None, 112, 112, 32) 288 ['stem_activation[0][0]']
onv2D)
block1a_bn (BatchNormaliza (None, 112, 112, 32) 128 ['block1a_dwconv[0][0]']
tion)
block1a_activation (Activa (None, 112, 112, 32) 0 ['block1a_bn[0][0]']
tion)
block1a_se_squeeze (Global (None, 32) 0 ['block1a_activation[0][0]']
AveragePooling2D)
block1a_se_reshape (Reshap (None, 1, 1, 32) 0 ['block1a_se_squeeze[0][0]']
e)
block1a_se_reduce (Conv2D) (None, 1, 1, 8) 264 ['block1a_se_reshape[0][0]']
block1a_se_expand (Conv2D) (None, 1, 1, 32) 288 ['block1a_se_reduce[0][0]']
block1a_se_excite (Multipl (None, 112, 112, 32) 0 ['block1a_activation[0][0]',
y) 'block1a_se_expand[0][0]']
block1a_project_conv (Conv (None, 112, 112, 16) 512 ['block1a_se_excite[0][0]']
2D)
block1a_project_bn (BatchN (None, 112, 112, 16) 64 ['block1a_project_conv[0][0]']
ormalization)
block2a_expand_conv (Conv2 (None, 112, 112, 96) 1536 ['block1a_project_bn[0][0]']
D)
block2a_expand_bn (BatchNo (None, 112, 112, 96) 384 ['block2a_expand_conv[0][0]']
rmalization)
block2a_expand_activation (None, 112, 112, 96) 0 ['block2a_expand_bn[0][0]']
(Activation)
block2a_dwconv_pad (ZeroPa (None, 113, 113, 96) 0 ['block2a_expand_activation[0]
dding2D) [0]']
block2a_dwconv (DepthwiseC (None, 56, 56, 96) 864 ['block2a_dwconv_pad[0][0]']
onv2D)
block2a_bn (BatchNormaliza (None, 56, 56, 96) 384 ['block2a_dwconv[0][0]']
tion)
block2a_activation (Activa (None, 56, 56, 96) 0 ['block2a_bn[0][0]']
tion)
block2a_se_squeeze (Global (None, 96) 0 ['block2a_activation[0][0]']
AveragePooling2D)
block2a_se_reshape (Reshap (None, 1, 1, 96) 0 ['block2a_se_squeeze[0][0]']
e)
block2a_se_reduce (Conv2D) (None, 1, 1, 4) 388 ['block2a_se_reshape[0][0]']
block2a_se_expand (Conv2D) (None, 1, 1, 96) 480 ['block2a_se_reduce[0][0]']
block2a_se_excite (Multipl (None, 56, 56, 96) 0 ['block2a_activation[0][0]',
y) 'block2a_se_expand[0][0]']
block2a_project_conv (Conv (None, 56, 56, 24) 2304 ['block2a_se_excite[0][0]']
2D)
block2a_project_bn (BatchN (None, 56, 56, 24) 96 ['block2a_project_conv[0][0]']
ormalization)
block2b_expand_conv (Conv2 (None, 56, 56, 144) 3456 ['block2a_project_bn[0][0]']
D)
block2b_expand_bn (BatchNo (None, 56, 56, 144) 576 ['block2b_expand_conv[0][0]']
rmalization)
block2b_expand_activation (None, 56, 56, 144) 0 ['block2b_expand_bn[0][0]']
(Activation)
block2b_dwconv (DepthwiseC (None, 56, 56, 144) 1296 ['block2b_expand_activation[0]
onv2D) [0]']
block2b_bn (BatchNormaliza (None, 56, 56, 144) 576 ['block2b_dwconv[0][0]']
tion)
block2b_activation (Activa (None, 56, 56, 144) 0 ['block2b_bn[0][0]']
tion)
block2b_se_squeeze (Global (None, 144) 0 ['block2b_activation[0][0]']
AveragePooling2D)
block2b_se_reshape (Reshap (None, 1, 1, 144) 0 ['block2b_se_squeeze[0][0]']
e)
block2b_se_reduce (Conv2D) (None, 1, 1, 6) 870 ['block2b_se_reshape[0][0]']
block2b_se_expand (Conv2D) (None, 1, 1, 144) 1008 ['block2b_se_reduce[0][0]']
block2b_se_excite (Multipl (None, 56, 56, 144) 0 ['block2b_activation[0][0]',
y) 'block2b_se_expand[0][0]']
block2b_project_conv (Conv (None, 56, 56, 24) 3456 ['block2b_se_excite[0][0]']
2D)
block2b_project_bn (BatchN (None, 56, 56, 24) 96 ['block2b_project_conv[0][0]']
ormalization)
block2b_drop (Dropout) (None, 56, 56, 24) 0 ['block2b_project_bn[0][0]']
block2b_add (Add) (None, 56, 56, 24) 0 ['block2b_drop[0][0]',
'block2a_project_bn[0][0]']
block3a_expand_conv (Conv2 (None, 56, 56, 144) 3456 ['block2b_add[0][0]']
D)
block3a_expand_bn (BatchNo (None, 56, 56, 144) 576 ['block3a_expand_conv[0][0]']
rmalization)
block3a_expand_activation (None, 56, 56, 144) 0 ['block3a_expand_bn[0][0]']
(Activation)
block3a_dwconv_pad (ZeroPa (None, 59, 59, 144) 0 ['block3a_expand_activation[0]
dding2D) [0]']
block3a_dwconv (DepthwiseC (None, 28, 28, 144) 3600 ['block3a_dwconv_pad[0][0]']
onv2D)
block3a_bn (BatchNormaliza (None, 28, 28, 144) 576 ['block3a_dwconv[0][0]']
tion)
block3a_activation (Activa (None, 28, 28, 144) 0 ['block3a_bn[0][0]']
tion)
block3a_se_squeeze (Global (None, 144) 0 ['block3a_activation[0][0]']
AveragePooling2D)
block3a_se_reshape (Reshap (None, 1, 1, 144) 0 ['block3a_se_squeeze[0][0]']
e)
block3a_se_reduce (Conv2D) (None, 1, 1, 6) 870 ['block3a_se_reshape[0][0]']
block3a_se_expand (Conv2D) (None, 1, 1, 144) 1008 ['block3a_se_reduce[0][0]']
block3a_se_excite (Multipl (None, 28, 28, 144) 0 ['block3a_activation[0][0]',
y) 'block3a_se_expand[0][0]']
block3a_project_conv (Conv (None, 28, 28, 40) 5760 ['block3a_se_excite[0][0]']
2D)
block3a_project_bn (BatchN (None, 28, 28, 40) 160 ['block3a_project_conv[0][0]']
ormalization)
block3b_expand_conv (Conv2 (None, 28, 28, 240) 9600 ['block3a_project_bn[0][0]']
D)
block3b_expand_bn (BatchNo (None, 28, 28, 240) 960 ['block3b_expand_conv[0][0]']
rmalization)
block3b_expand_activation (None, 28, 28, 240) 0 ['block3b_expand_bn[0][0]']
(Activation)
block3b_dwconv (DepthwiseC (None, 28, 28, 240) 6000 ['block3b_expand_activation[0]
onv2D) [0]']
block3b_bn (BatchNormaliza (None, 28, 28, 240) 960 ['block3b_dwconv[0][0]']
tion)
block3b_activation (Activa (None, 28, 28, 240) 0 ['block3b_bn[0][0]']
tion)
block3b_se_squeeze (Global (None, 240) 0 ['block3b_activation[0][0]']
AveragePooling2D)
block3b_se_reshape (Reshap (None, 1, 1, 240) 0 ['block3b_se_squeeze[0][0]']
e)
block3b_se_reduce (Conv2D) (None, 1, 1, 10) 2410 ['block3b_se_reshape[0][0]']
block3b_se_expand (Conv2D) (None, 1, 1, 240) 2640 ['block3b_se_reduce[0][0]']
block3b_se_excite (Multipl (None, 28, 28, 240) 0 ['block3b_activation[0][0]',
y) 'block3b_se_expand[0][0]']
block3b_project_conv (Conv (None, 28, 28, 40) 9600 ['block3b_se_excite[0][0]']
2D)
block3b_project_bn (BatchN (None, 28, 28, 40) 160 ['block3b_project_conv[0][0]']
ormalization)
block3b_drop (Dropout) (None, 28, 28, 40) 0 ['block3b_project_bn[0][0]']
block3b_add (Add) (None, 28, 28, 40) 0 ['block3b_drop[0][0]',
'block3a_project_bn[0][0]']
block4a_expand_conv (Conv2 (None, 28, 28, 240) 9600 ['block3b_add[0][0]']
D)
block4a_expand_bn (BatchNo (None, 28, 28, 240) 960 ['block4a_expand_conv[0][0]']
rmalization)
block4a_expand_activation (None, 28, 28, 240) 0 ['block4a_expand_bn[0][0]']
(Activation)
block4a_dwconv_pad (ZeroPa (None, 29, 29, 240) 0 ['block4a_expand_activation[0]
dding2D) [0]']
block4a_dwconv (DepthwiseC (None, 14, 14, 240) 2160 ['block4a_dwconv_pad[0][0]']
onv2D)
block4a_bn (BatchNormaliza (None, 14, 14, 240) 960 ['block4a_dwconv[0][0]']
tion)
block4a_activation (Activa (None, 14, 14, 240) 0 ['block4a_bn[0][0]']
tion)
block4a_se_squeeze (Global (None, 240) 0 ['block4a_activation[0][0]']
AveragePooling2D)
block4a_se_reshape (Reshap (None, 1, 1, 240) 0 ['block4a_se_squeeze[0][0]']
e)
block4a_se_reduce (Conv2D) (None, 1, 1, 10) 2410 ['block4a_se_reshape[0][0]']
block4a_se_expand (Conv2D) (None, 1, 1, 240) 2640 ['block4a_se_reduce[0][0]']
block4a_se_excite (Multipl (None, 14, 14, 240) 0 ['block4a_activation[0][0]',
y) 'block4a_se_expand[0][0]']
block4a_project_conv (Conv (None, 14, 14, 80) 19200 ['block4a_se_excite[0][0]']
2D)
block4a_project_bn (BatchN (None, 14, 14, 80) 320 ['block4a_project_conv[0][0]']
ormalization)
block4b_expand_conv (Conv2 (None, 14, 14, 480) 38400 ['block4a_project_bn[0][0]']
D)
block4b_expand_bn (BatchNo (None, 14, 14, 480) 1920 ['block4b_expand_conv[0][0]']
rmalization)
block4b_expand_activation (None, 14, 14, 480) 0 ['block4b_expand_bn[0][0]']
(Activation)
block4b_dwconv (DepthwiseC (None, 14, 14, 480) 4320 ['block4b_expand_activation[0]
onv2D) [0]']
block4b_bn (BatchNormaliza (None, 14, 14, 480) 1920 ['block4b_dwconv[0][0]']
tion)
block4b_activation (Activa (None, 14, 14, 480) 0 ['block4b_bn[0][0]']
tion)
block4b_se_squeeze (Global (None, 480) 0 ['block4b_activation[0][0]']
AveragePooling2D)
block4b_se_reshape (Reshap (None, 1, 1, 480) 0 ['block4b_se_squeeze[0][0]']
e)
block4b_se_reduce (Conv2D) (None, 1, 1, 20) 9620 ['block4b_se_reshape[0][0]']
block4b_se_expand (Conv2D) (None, 1, 1, 480) 10080 ['block4b_se_reduce[0][0]']
block4b_se_excite (Multipl (None, 14, 14, 480) 0 ['block4b_activation[0][0]',
y) 'block4b_se_expand[0][0]']
block4b_project_conv (Conv (None, 14, 14, 80) 38400 ['block4b_se_excite[0][0]']
2D)
block4b_project_bn (BatchN (None, 14, 14, 80) 320 ['block4b_project_conv[0][0]']
ormalization)
block4b_drop (Dropout) (None, 14, 14, 80) 0 ['block4b_project_bn[0][0]']
block4b_add (Add) (None, 14, 14, 80) 0 ['block4b_drop[0][0]',
'block4a_project_bn[0][0]']
block4c_expand_conv (Conv2 (None, 14, 14, 480) 38400 ['block4b_add[0][0]']
D)
block4c_expand_bn (BatchNo (None, 14, 14, 480) 1920 ['block4c_expand_conv[0][0]']
rmalization)
block4c_expand_activation (None, 14, 14, 480) 0 ['block4c_expand_bn[0][0]']
(Activation)
block4c_dwconv (DepthwiseC (None, 14, 14, 480) 4320 ['block4c_expand_activation[0]
onv2D) [0]']
block4c_bn (BatchNormaliza (None, 14, 14, 480) 1920 ['block4c_dwconv[0][0]']
tion)
block4c_activation (Activa (None, 14, 14, 480) 0 ['block4c_bn[0][0]']
tion)
block4c_se_squeeze (Global (None, 480) 0 ['block4c_activation[0][0]']
AveragePooling2D)
block4c_se_reshape (Reshap (None, 1, 1, 480) 0 ['block4c_se_squeeze[0][0]']
e)
block4c_se_reduce (Conv2D) (None, 1, 1, 20) 9620 ['block4c_se_reshape[0][0]']
block4c_se_expand (Conv2D) (None, 1, 1, 480) 10080 ['block4c_se_reduce[0][0]']
block4c_se_excite (Multipl (None, 14, 14, 480) 0 ['block4c_activation[0][0]',
y) 'block4c_se_expand[0][0]']
block4c_project_conv (Conv (None, 14, 14, 80) 38400 ['block4c_se_excite[0][0]']
2D)
block4c_project_bn (BatchN (None, 14, 14, 80) 320 ['block4c_project_conv[0][0]']
ormalization)
block4c_drop (Dropout) (None, 14, 14, 80) 0 ['block4c_project_bn[0][0]']
block4c_add (Add) (None, 14, 14, 80) 0 ['block4c_drop[0][0]',
'block4b_add[0][0]']
block5a_expand_conv (Conv2 (None, 14, 14, 480) 38400 ['block4c_add[0][0]']
D)
block5a_expand_bn (BatchNo (None, 14, 14, 480) 1920 ['block5a_expand_conv[0][0]']
rmalization)
block5a_expand_activation (None, 14, 14, 480) 0 ['block5a_expand_bn[0][0]']
(Activation)
block5a_dwconv (DepthwiseC (None, 14, 14, 480) 12000 ['block5a_expand_activation[0]
onv2D) [0]']
block5a_bn (BatchNormaliza (None, 14, 14, 480) 1920 ['block5a_dwconv[0][0]']
tion)
block5a_activation (Activa (None, 14, 14, 480) 0 ['block5a_bn[0][0]']
tion)
block5a_se_squeeze (Global (None, 480) 0 ['block5a_activation[0][0]']
AveragePooling2D)
block5a_se_reshape (Reshap (None, 1, 1, 480) 0 ['block5a_se_squeeze[0][0]']
e)
block5a_se_reduce (Conv2D) (None, 1, 1, 20) 9620 ['block5a_se_reshape[0][0]']
block5a_se_expand (Conv2D) (None, 1, 1, 480) 10080 ['block5a_se_reduce[0][0]']
block5a_se_excite (Multipl (None, 14, 14, 480) 0 ['block5a_activation[0][0]',
y) 'block5a_se_expand[0][0]']
block5a_project_conv (Conv (None, 14, 14, 112) 53760 ['block5a_se_excite[0][0]']
2D)
block5a_project_bn (BatchN (None, 14, 14, 112) 448 ['block5a_project_conv[0][0]']
ormalization)
block5b_expand_conv (Conv2 (None, 14, 14, 672) 75264 ['block5a_project_bn[0][0]']
D)
block5b_expand_bn (BatchNo (None, 14, 14, 672) 2688 ['block5b_expand_conv[0][0]']
rmalization)
block5b_expand_activation (None, 14, 14, 672) 0 ['block5b_expand_bn[0][0]']
(Activation)
block5b_dwconv (DepthwiseC (None, 14, 14, 672) 16800 ['block5b_expand_activation[0]
onv2D) [0]']
block5b_bn (BatchNormaliza (None, 14, 14, 672) 2688 ['block5b_dwconv[0][0]']
tion)
block5b_activation (Activa (None, 14, 14, 672) 0 ['block5b_bn[0][0]']
tion)
block5b_se_squeeze (Global (None, 672) 0 ['block5b_activation[0][0]']
AveragePooling2D)
block5b_se_reshape (Reshap (None, 1, 1, 672) 0 ['block5b_se_squeeze[0][0]']
e)
block5b_se_reduce (Conv2D) (None, 1, 1, 28) 18844 ['block5b_se_reshape[0][0]']
block5b_se_expand (Conv2D) (None, 1, 1, 672) 19488 ['block5b_se_reduce[0][0]']
block5b_se_excite (Multipl (None, 14, 14, 672) 0 ['block5b_activation[0][0]',
y) 'block5b_se_expand[0][0]']
block5b_project_conv (Conv (None, 14, 14, 112) 75264 ['block5b_se_excite[0][0]']
2D)
block5b_project_bn (BatchN (None, 14, 14, 112) 448 ['block5b_project_conv[0][0]']
ormalization)
block5b_drop (Dropout) (None, 14, 14, 112) 0 ['block5b_project_bn[0][0]']
block5b_add (Add) (None, 14, 14, 112) 0 ['block5b_drop[0][0]',
'block5a_project_bn[0][0]']
block5c_expand_conv (Conv2 (None, 14, 14, 672) 75264 ['block5b_add[0][0]']
D)
block5c_expand_bn (BatchNo (None, 14, 14, 672) 2688 ['block5c_expand_conv[0][0]']
rmalization)
block5c_expand_activation (None, 14, 14, 672) 0 ['block5c_expand_bn[0][0]']
(Activation)
block5c_dwconv (DepthwiseC (None, 14, 14, 672) 16800 ['block5c_expand_activation[0]
onv2D) [0]']
block5c_bn (BatchNormaliza (None, 14, 14, 672) 2688 ['block5c_dwconv[0][0]']
tion)
block5c_activation (Activa (None, 14, 14, 672) 0 ['block5c_bn[0][0]']
tion)
block5c_se_squeeze (Global (None, 672) 0 ['block5c_activation[0][0]']
AveragePooling2D)
block5c_se_reshape (Reshap (None, 1, 1, 672) 0 ['block5c_se_squeeze[0][0]']
e)
block5c_se_reduce (Conv2D) (None, 1, 1, 28) 18844 ['block5c_se_reshape[0][0]']
block5c_se_expand (Conv2D) (None, 1, 1, 672) 19488 ['block5c_se_reduce[0][0]']
block5c_se_excite (Multipl (None, 14, 14, 672) 0 ['block5c_activation[0][0]',
y) 'block5c_se_expand[0][0]']
block5c_project_conv (Conv (None, 14, 14, 112) 75264 ['block5c_se_excite[0][0]']
2D)
block5c_project_bn (BatchN (None, 14, 14, 112) 448 ['block5c_project_conv[0][0]']
ormalization)
block5c_drop (Dropout) (None, 14, 14, 112) 0 ['block5c_project_bn[0][0]']
block5c_add (Add) (None, 14, 14, 112) 0 ['block5c_drop[0][0]',
'block5b_add[0][0]']
block6a_expand_conv (Conv2 (None, 14, 14, 672) 75264 ['block5c_add[0][0]']
D)
block6a_expand_bn (BatchNo (None, 14, 14, 672) 2688 ['block6a_expand_conv[0][0]']
rmalization)
block6a_expand_activation (None, 14, 14, 672) 0 ['block6a_expand_bn[0][0]']
(Activation)
block6a_dwconv_pad (ZeroPa (None, 17, 17, 672) 0 ['block6a_expand_activation[0]
dding2D) [0]']
block6a_dwconv (DepthwiseC (None, 7, 7, 672) 16800 ['block6a_dwconv_pad[0][0]']
onv2D)
block6a_bn (BatchNormaliza (None, 7, 7, 672) 2688 ['block6a_dwconv[0][0]']
tion)
block6a_activation (Activa (None, 7, 7, 672) 0 ['block6a_bn[0][0]']
tion)
block6a_se_squeeze (Global (None, 672) 0 ['block6a_activation[0][0]']
AveragePooling2D)
block6a_se_reshape (Reshap (None, 1, 1, 672) 0 ['block6a_se_squeeze[0][0]']
e)
block6a_se_reduce (Conv2D) (None, 1, 1, 28) 18844 ['block6a_se_reshape[0][0]']
block6a_se_expand (Conv2D) (None, 1, 1, 672) 19488 ['block6a_se_reduce[0][0]']
block6a_se_excite (Multipl (None, 7, 7, 672) 0 ['block6a_activation[0][0]',
y) 'block6a_se_expand[0][0]']
block6a_project_conv (Conv (None, 7, 7, 192) 129024 ['block6a_se_excite[0][0]']
2D)
block6a_project_bn (BatchN (None, 7, 7, 192) 768 ['block6a_project_conv[0][0]']
ormalization)
block6b_expand_conv (Conv2 (None, 7, 7, 1152) 221184 ['block6a_project_bn[0][0]']
D)
block6b_expand_bn (BatchNo (None, 7, 7, 1152) 4608 ['block6b_expand_conv[0][0]']
rmalization)
block6b_expand_activation (None, 7, 7, 1152) 0 ['block6b_expand_bn[0][0]']
(Activation)
block6b_dwconv (DepthwiseC (None, 7, 7, 1152) 28800 ['block6b_expand_activation[0]
onv2D) [0]']
block6b_bn (BatchNormaliza (None, 7, 7, 1152) 4608 ['block6b_dwconv[0][0]']
tion)
block6b_activation (Activa (None, 7, 7, 1152) 0 ['block6b_bn[0][0]']
tion)
block6b_se_squeeze (Global (None, 1152) 0 ['block6b_activation[0][0]']
AveragePooling2D)
block6b_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6b_se_squeeze[0][0]']
e)
block6b_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6b_se_reshape[0][0]']
block6b_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6b_se_reduce[0][0]']
block6b_se_excite (Multipl (None, 7, 7, 1152) 0 ['block6b_activation[0][0]',
y) 'block6b_se_expand[0][0]']
block6b_project_conv (Conv (None, 7, 7, 192) 221184 ['block6b_se_excite[0][0]']
2D)
block6b_project_bn (BatchN (None, 7, 7, 192) 768 ['block6b_project_conv[0][0]']
ormalization)
block6b_drop (Dropout) (None, 7, 7, 192) 0 ['block6b_project_bn[0][0]']
block6b_add (Add) (None, 7, 7, 192) 0 ['block6b_drop[0][0]',
'block6a_project_bn[0][0]']
block6c_expand_conv (Conv2 (None, 7, 7, 1152) 221184 ['block6b_add[0][0]']
D)
block6c_expand_bn (BatchNo (None, 7, 7, 1152) 4608 ['block6c_expand_conv[0][0]']
rmalization)
block6c_expand_activation (None, 7, 7, 1152) 0 ['block6c_expand_bn[0][0]']
(Activation)
block6c_dwconv (DepthwiseC (None, 7, 7, 1152) 28800 ['block6c_expand_activation[0]
onv2D) [0]']
block6c_bn (BatchNormaliza (None, 7, 7, 1152) 4608 ['block6c_dwconv[0][0]']
tion)
block6c_activation (Activa (None, 7, 7, 1152) 0 ['block6c_bn[0][0]']
tion)
block6c_se_squeeze (Global (None, 1152) 0 ['block6c_activation[0][0]']
AveragePooling2D)
block6c_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6c_se_squeeze[0][0]']
e)
block6c_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6c_se_reshape[0][0]']
block6c_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6c_se_reduce[0][0]']
block6c_se_excite (Multipl (None, 7, 7, 1152) 0 ['block6c_activation[0][0]',
y) 'block6c_se_expand[0][0]']
block6c_project_conv (Conv (None, 7, 7, 192) 221184 ['block6c_se_excite[0][0]']
2D)
block6c_project_bn (BatchN (None, 7, 7, 192) 768 ['block6c_project_conv[0][0]']
ormalization)
block6c_drop (Dropout) (None, 7, 7, 192) 0 ['block6c_project_bn[0][0]']
block6c_add (Add) (None, 7, 7, 192) 0 ['block6c_drop[0][0]',
'block6b_add[0][0]']
block6d_expand_conv (Conv2 (None, 7, 7, 1152) 221184 ['block6c_add[0][0]']
D)
block6d_expand_bn (BatchNo (None, 7, 7, 1152) 4608 ['block6d_expand_conv[0][0]']
rmalization)
block6d_expand_activation (None, 7, 7, 1152) 0 ['block6d_expand_bn[0][0]']
(Activation)
block6d_dwconv (DepthwiseC (None, 7, 7, 1152) 28800 ['block6d_expand_activation[0]
onv2D) [0]']
block6d_bn (BatchNormaliza (None, 7, 7, 1152) 4608 ['block6d_dwconv[0][0]']
tion)
block6d_activation (Activa (None, 7, 7, 1152) 0 ['block6d_bn[0][0]']
tion)
block6d_se_squeeze (Global (None, 1152) 0 ['block6d_activation[0][0]']
AveragePooling2D)
block6d_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6d_se_squeeze[0][0]']
e)
block6d_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6d_se_reshape[0][0]']
block6d_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6d_se_reduce[0][0]']
block6d_se_excite (Multipl (None, 7, 7, 1152) 0 ['block6d_activation[0][0]',
y) 'block6d_se_expand[0][0]']
block6d_project_conv (Conv (None, 7, 7, 192) 221184 ['block6d_se_excite[0][0]']
2D)
block6d_project_bn (BatchN (None, 7, 7, 192) 768 ['block6d_project_conv[0][0]']
ormalization)
block6d_drop (Dropout) (None, 7, 7, 192) 0 ['block6d_project_bn[0][0]']
block6d_add (Add) (None, 7, 7, 192) 0 ['block6d_drop[0][0]',
'block6c_add[0][0]']
block7a_expand_conv (Conv2 (None, 7, 7, 1152) 221184 ['block6d_add[0][0]']
D)
block7a_expand_bn (BatchNo (None, 7, 7, 1152) 4608 ['block7a_expand_conv[0][0]']
rmalization)
block7a_expand_activation (None, 7, 7, 1152) 0 ['block7a_expand_bn[0][0]']
(Activation)
block7a_dwconv (DepthwiseC (None, 7, 7, 1152) 10368 ['block7a_expand_activation[0]
onv2D) [0]']
block7a_bn (BatchNormaliza (None, 7, 7, 1152) 4608 ['block7a_dwconv[0][0]']
tion)
block7a_activation (Activa (None, 7, 7, 1152) 0 ['block7a_bn[0][0]']
tion)
block7a_se_squeeze (Global (None, 1152) 0 ['block7a_activation[0][0]']
AveragePooling2D)
block7a_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block7a_se_squeeze[0][0]']
e)
block7a_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block7a_se_reshape[0][0]']
block7a_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block7a_se_reduce[0][0]']
block7a_se_excite (Multipl (None, 7, 7, 1152) 0 ['block7a_activation[0][0]',
y) 'block7a_se_expand[0][0]']
block7a_project_conv (Conv (None, 7, 7, 320) 368640 ['block7a_se_excite[0][0]']
2D)
block7a_project_bn (BatchN (None, 7, 7, 320) 1280 ['block7a_project_conv[0][0]']
ormalization)
top_conv (Conv2D) (None, 7, 7, 1280) 409600 ['block7a_project_bn[0][0]']
top_bn (BatchNormalization (None, 7, 7, 1280) 5120 ['top_conv[0][0]']
)
top_activation (Activation (None, 7, 7, 1280) 0 ['top_bn[0][0]']
)
global_average_pooling2d ( (None, 1280) 0 ['top_activation[0][0]']
GlobalAveragePooling2D)
dense_8 (Dense) (None, 256) 327936 ['global_average_pooling2d[0][
0]']
dense_9 (Dense) (None, 1) 257 ['dense_8[0][0]']
==================================================================================================
Total params: 4377764 (16.70 MB)
Trainable params: 1679153 (6.41 MB)
Non-trainable params: 2698611 (10.29 MB)
__________________________________________________________________________________________________
# Compile the model with binary cross-entropy loss and an optimizer
tl_model.compile(loss='binary_crossentropy', optimizer=Adam(learning_rate=0.0001), metrics=['accuracy'])
# Train the model on the dataset
tl_history = tl_model.fit(datagen.flow(X_train[:6000], y_train[:6000], batch_size=10), epochs=10, validation_data=(X_test[:6000], y_test[:6000]))
Epoch 1/10 600/600 [==============================] - 79s 114ms/step - loss: 0.6930 - accuracy: 0.5573 - val_loss: 0.6317 - val_accuracy: 0.7202 Epoch 2/10 600/600 [==============================] - 64s 106ms/step - loss: 0.6861 - accuracy: 0.5708 - val_loss: 0.6365 - val_accuracy: 0.7202 Epoch 3/10 600/600 [==============================] - 64s 106ms/step - loss: 0.6844 - accuracy: 0.5747 - val_loss: 0.6474 - val_accuracy: 0.7202 Epoch 4/10 600/600 [==============================] - 64s 106ms/step - loss: 0.6834 - accuracy: 0.5760 - val_loss: 0.6380 - val_accuracy: 0.7202 Epoch 5/10 600/600 [==============================] - 64s 107ms/step - loss: 0.6833 - accuracy: 0.5760 - val_loss: 0.6599 - val_accuracy: 0.7202 Epoch 6/10 600/600 [==============================] - 64s 107ms/step - loss: 0.6829 - accuracy: 0.5765 - val_loss: 0.6452 - val_accuracy: 0.7202 Epoch 7/10 600/600 [==============================] - 64s 106ms/step - loss: 0.6826 - accuracy: 0.5757 - val_loss: 0.6434 - val_accuracy: 0.7202 Epoch 8/10 600/600 [==============================] - 64s 107ms/step - loss: 0.6826 - accuracy: 0.5763 - val_loss: 0.6399 - val_accuracy: 0.7202 Epoch 9/10 600/600 [==============================] - 64s 106ms/step - loss: 0.6825 - accuracy: 0.5765 - val_loss: 0.6438 - val_accuracy: 0.7202 Epoch 10/10 600/600 [==============================] - 64s 107ms/step - loss: 0.6819 - accuracy: 0.5758 - val_loss: 0.6589 - val_accuracy: 0.7202
# Create a single figure with two subplots, arranged horizontally
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Plot training loss and validation loss in the first subplot
ax1.plot(tl_history.history['val_accuracy'], label='validation accuracy')
ax1.plot(tl_history.history['accuracy'], label='Training Accuracy')
ax1.set_title('Training Loss & Accuracy')
ax1.legend()
# Plot training accuracy and validation accuracy in the second subplot
ax2.plot(tl_history.history['val_loss'], label='Validation Loss')
ax2.plot(tl_history.history['loss'], label='Training Loss')
ax2.set_title('Validation Loss & Accuracy')
ax2.legend()
plt.tight_layout()
plt.show()
Using MobileNetV2 to see if there is any improvemnet over ResNet50 and EfficientNetB0.
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
# Load the MobileNetV2 model pre-trained on ImageNet data
mn_base_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# Freeze the layers of the base model
for layer in mn_base_model.layers:
layer.trainable = False
# Create a new model on top of the base model
x = GlobalAveragePooling2D()(mn_base_model.output)
x = Dense(128, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(1, activation='sigmoid')(x)
mn_model = Model(inputs=mn_base_model.input, outputs=predictions)
# Compile the model with an appropriate optimizer, loss function, and metrics
optimizer = Adam(learning_rate=0.0001)
mn_model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
# Print the model summary
mn_model.summary()
Model: "model_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_4 (InputLayer) [(None, 224, 224, 3)] 0 []
Conv1 (Conv2D) (None, 112, 112, 32) 864 ['input_4[0][0]']
bn_Conv1 (BatchNormalizati (None, 112, 112, 32) 128 ['Conv1[0][0]']
on)
Conv1_relu (ReLU) (None, 112, 112, 32) 0 ['bn_Conv1[0][0]']
expanded_conv_depthwise (D (None, 112, 112, 32) 288 ['Conv1_relu[0][0]']
epthwiseConv2D)
expanded_conv_depthwise_BN (None, 112, 112, 32) 128 ['expanded_conv_depthwise[0][0
(BatchNormalization) ]']
expanded_conv_depthwise_re (None, 112, 112, 32) 0 ['expanded_conv_depthwise_BN[0
lu (ReLU) ][0]']
expanded_conv_project (Con (None, 112, 112, 16) 512 ['expanded_conv_depthwise_relu
v2D) [0][0]']
expanded_conv_project_BN ( (None, 112, 112, 16) 64 ['expanded_conv_project[0][0]'
BatchNormalization) ]
block_1_expand (Conv2D) (None, 112, 112, 96) 1536 ['expanded_conv_project_BN[0][
0]']
block_1_expand_BN (BatchNo (None, 112, 112, 96) 384 ['block_1_expand[0][0]']
rmalization)
block_1_expand_relu (ReLU) (None, 112, 112, 96) 0 ['block_1_expand_BN[0][0]']
block_1_pad (ZeroPadding2D (None, 113, 113, 96) 0 ['block_1_expand_relu[0][0]']
)
block_1_depthwise (Depthwi (None, 56, 56, 96) 864 ['block_1_pad[0][0]']
seConv2D)
block_1_depthwise_BN (Batc (None, 56, 56, 96) 384 ['block_1_depthwise[0][0]']
hNormalization)
block_1_depthwise_relu (Re (None, 56, 56, 96) 0 ['block_1_depthwise_BN[0][0]']
LU)
block_1_project (Conv2D) (None, 56, 56, 24) 2304 ['block_1_depthwise_relu[0][0]
']
block_1_project_BN (BatchN (None, 56, 56, 24) 96 ['block_1_project[0][0]']
ormalization)
block_2_expand (Conv2D) (None, 56, 56, 144) 3456 ['block_1_project_BN[0][0]']
block_2_expand_BN (BatchNo (None, 56, 56, 144) 576 ['block_2_expand[0][0]']
rmalization)
block_2_expand_relu (ReLU) (None, 56, 56, 144) 0 ['block_2_expand_BN[0][0]']
block_2_depthwise (Depthwi (None, 56, 56, 144) 1296 ['block_2_expand_relu[0][0]']
seConv2D)
block_2_depthwise_BN (Batc (None, 56, 56, 144) 576 ['block_2_depthwise[0][0]']
hNormalization)
block_2_depthwise_relu (Re (None, 56, 56, 144) 0 ['block_2_depthwise_BN[0][0]']
LU)
block_2_project (Conv2D) (None, 56, 56, 24) 3456 ['block_2_depthwise_relu[0][0]
']
block_2_project_BN (BatchN (None, 56, 56, 24) 96 ['block_2_project[0][0]']
ormalization)
block_2_add (Add) (None, 56, 56, 24) 0 ['block_1_project_BN[0][0]',
'block_2_project_BN[0][0]']
block_3_expand (Conv2D) (None, 56, 56, 144) 3456 ['block_2_add[0][0]']
block_3_expand_BN (BatchNo (None, 56, 56, 144) 576 ['block_3_expand[0][0]']
rmalization)
block_3_expand_relu (ReLU) (None, 56, 56, 144) 0 ['block_3_expand_BN[0][0]']
block_3_pad (ZeroPadding2D (None, 57, 57, 144) 0 ['block_3_expand_relu[0][0]']
)
block_3_depthwise (Depthwi (None, 28, 28, 144) 1296 ['block_3_pad[0][0]']
seConv2D)
block_3_depthwise_BN (Batc (None, 28, 28, 144) 576 ['block_3_depthwise[0][0]']
hNormalization)
block_3_depthwise_relu (Re (None, 28, 28, 144) 0 ['block_3_depthwise_BN[0][0]']
LU)
block_3_project (Conv2D) (None, 28, 28, 32) 4608 ['block_3_depthwise_relu[0][0]
']
block_3_project_BN (BatchN (None, 28, 28, 32) 128 ['block_3_project[0][0]']
ormalization)
block_4_expand (Conv2D) (None, 28, 28, 192) 6144 ['block_3_project_BN[0][0]']
block_4_expand_BN (BatchNo (None, 28, 28, 192) 768 ['block_4_expand[0][0]']
rmalization)
block_4_expand_relu (ReLU) (None, 28, 28, 192) 0 ['block_4_expand_BN[0][0]']
block_4_depthwise (Depthwi (None, 28, 28, 192) 1728 ['block_4_expand_relu[0][0]']
seConv2D)
block_4_depthwise_BN (Batc (None, 28, 28, 192) 768 ['block_4_depthwise[0][0]']
hNormalization)
block_4_depthwise_relu (Re (None, 28, 28, 192) 0 ['block_4_depthwise_BN[0][0]']
LU)
block_4_project (Conv2D) (None, 28, 28, 32) 6144 ['block_4_depthwise_relu[0][0]
']
block_4_project_BN (BatchN (None, 28, 28, 32) 128 ['block_4_project[0][0]']
ormalization)
block_4_add (Add) (None, 28, 28, 32) 0 ['block_3_project_BN[0][0]',
'block_4_project_BN[0][0]']
block_5_expand (Conv2D) (None, 28, 28, 192) 6144 ['block_4_add[0][0]']
block_5_expand_BN (BatchNo (None, 28, 28, 192) 768 ['block_5_expand[0][0]']
rmalization)
block_5_expand_relu (ReLU) (None, 28, 28, 192) 0 ['block_5_expand_BN[0][0]']
block_5_depthwise (Depthwi (None, 28, 28, 192) 1728 ['block_5_expand_relu[0][0]']
seConv2D)
block_5_depthwise_BN (Batc (None, 28, 28, 192) 768 ['block_5_depthwise[0][0]']
hNormalization)
block_5_depthwise_relu (Re (None, 28, 28, 192) 0 ['block_5_depthwise_BN[0][0]']
LU)
block_5_project (Conv2D) (None, 28, 28, 32) 6144 ['block_5_depthwise_relu[0][0]
']
block_5_project_BN (BatchN (None, 28, 28, 32) 128 ['block_5_project[0][0]']
ormalization)
block_5_add (Add) (None, 28, 28, 32) 0 ['block_4_add[0][0]',
'block_5_project_BN[0][0]']
block_6_expand (Conv2D) (None, 28, 28, 192) 6144 ['block_5_add[0][0]']
block_6_expand_BN (BatchNo (None, 28, 28, 192) 768 ['block_6_expand[0][0]']
rmalization)
block_6_expand_relu (ReLU) (None, 28, 28, 192) 0 ['block_6_expand_BN[0][0]']
block_6_pad (ZeroPadding2D (None, 29, 29, 192) 0 ['block_6_expand_relu[0][0]']
)
block_6_depthwise (Depthwi (None, 14, 14, 192) 1728 ['block_6_pad[0][0]']
seConv2D)
block_6_depthwise_BN (Batc (None, 14, 14, 192) 768 ['block_6_depthwise[0][0]']
hNormalization)
block_6_depthwise_relu (Re (None, 14, 14, 192) 0 ['block_6_depthwise_BN[0][0]']
LU)
block_6_project (Conv2D) (None, 14, 14, 64) 12288 ['block_6_depthwise_relu[0][0]
']
block_6_project_BN (BatchN (None, 14, 14, 64) 256 ['block_6_project[0][0]']
ormalization)
block_7_expand (Conv2D) (None, 14, 14, 384) 24576 ['block_6_project_BN[0][0]']
block_7_expand_BN (BatchNo (None, 14, 14, 384) 1536 ['block_7_expand[0][0]']
rmalization)
block_7_expand_relu (ReLU) (None, 14, 14, 384) 0 ['block_7_expand_BN[0][0]']
block_7_depthwise (Depthwi (None, 14, 14, 384) 3456 ['block_7_expand_relu[0][0]']
seConv2D)
block_7_depthwise_BN (Batc (None, 14, 14, 384) 1536 ['block_7_depthwise[0][0]']
hNormalization)
block_7_depthwise_relu (Re (None, 14, 14, 384) 0 ['block_7_depthwise_BN[0][0]']
LU)
block_7_project (Conv2D) (None, 14, 14, 64) 24576 ['block_7_depthwise_relu[0][0]
']
block_7_project_BN (BatchN (None, 14, 14, 64) 256 ['block_7_project[0][0]']
ormalization)
block_7_add (Add) (None, 14, 14, 64) 0 ['block_6_project_BN[0][0]',
'block_7_project_BN[0][0]']
block_8_expand (Conv2D) (None, 14, 14, 384) 24576 ['block_7_add[0][0]']
block_8_expand_BN (BatchNo (None, 14, 14, 384) 1536 ['block_8_expand[0][0]']
rmalization)
block_8_expand_relu (ReLU) (None, 14, 14, 384) 0 ['block_8_expand_BN[0][0]']
block_8_depthwise (Depthwi (None, 14, 14, 384) 3456 ['block_8_expand_relu[0][0]']
seConv2D)
block_8_depthwise_BN (Batc (None, 14, 14, 384) 1536 ['block_8_depthwise[0][0]']
hNormalization)
block_8_depthwise_relu (Re (None, 14, 14, 384) 0 ['block_8_depthwise_BN[0][0]']
LU)
block_8_project (Conv2D) (None, 14, 14, 64) 24576 ['block_8_depthwise_relu[0][0]
']
block_8_project_BN (BatchN (None, 14, 14, 64) 256 ['block_8_project[0][0]']
ormalization)
block_8_add (Add) (None, 14, 14, 64) 0 ['block_7_add[0][0]',
'block_8_project_BN[0][0]']
block_9_expand (Conv2D) (None, 14, 14, 384) 24576 ['block_8_add[0][0]']
block_9_expand_BN (BatchNo (None, 14, 14, 384) 1536 ['block_9_expand[0][0]']
rmalization)
block_9_expand_relu (ReLU) (None, 14, 14, 384) 0 ['block_9_expand_BN[0][0]']
block_9_depthwise (Depthwi (None, 14, 14, 384) 3456 ['block_9_expand_relu[0][0]']
seConv2D)
block_9_depthwise_BN (Batc (None, 14, 14, 384) 1536 ['block_9_depthwise[0][0]']
hNormalization)
block_9_depthwise_relu (Re (None, 14, 14, 384) 0 ['block_9_depthwise_BN[0][0]']
LU)
block_9_project (Conv2D) (None, 14, 14, 64) 24576 ['block_9_depthwise_relu[0][0]
']
block_9_project_BN (BatchN (None, 14, 14, 64) 256 ['block_9_project[0][0]']
ormalization)
block_9_add (Add) (None, 14, 14, 64) 0 ['block_8_add[0][0]',
'block_9_project_BN[0][0]']
block_10_expand (Conv2D) (None, 14, 14, 384) 24576 ['block_9_add[0][0]']
block_10_expand_BN (BatchN (None, 14, 14, 384) 1536 ['block_10_expand[0][0]']
ormalization)
block_10_expand_relu (ReLU (None, 14, 14, 384) 0 ['block_10_expand_BN[0][0]']
)
block_10_depthwise (Depthw (None, 14, 14, 384) 3456 ['block_10_expand_relu[0][0]']
iseConv2D)
block_10_depthwise_BN (Bat (None, 14, 14, 384) 1536 ['block_10_depthwise[0][0]']
chNormalization)
block_10_depthwise_relu (R (None, 14, 14, 384) 0 ['block_10_depthwise_BN[0][0]'
eLU) ]
block_10_project (Conv2D) (None, 14, 14, 96) 36864 ['block_10_depthwise_relu[0][0
]']
block_10_project_BN (Batch (None, 14, 14, 96) 384 ['block_10_project[0][0]']
Normalization)
block_11_expand (Conv2D) (None, 14, 14, 576) 55296 ['block_10_project_BN[0][0]']
block_11_expand_BN (BatchN (None, 14, 14, 576) 2304 ['block_11_expand[0][0]']
ormalization)
block_11_expand_relu (ReLU (None, 14, 14, 576) 0 ['block_11_expand_BN[0][0]']
)
block_11_depthwise (Depthw (None, 14, 14, 576) 5184 ['block_11_expand_relu[0][0]']
iseConv2D)
block_11_depthwise_BN (Bat (None, 14, 14, 576) 2304 ['block_11_depthwise[0][0]']
chNormalization)
block_11_depthwise_relu (R (None, 14, 14, 576) 0 ['block_11_depthwise_BN[0][0]'
eLU) ]
block_11_project (Conv2D) (None, 14, 14, 96) 55296 ['block_11_depthwise_relu[0][0
]']
block_11_project_BN (Batch (None, 14, 14, 96) 384 ['block_11_project[0][0]']
Normalization)
block_11_add (Add) (None, 14, 14, 96) 0 ['block_10_project_BN[0][0]',
'block_11_project_BN[0][0]']
block_12_expand (Conv2D) (None, 14, 14, 576) 55296 ['block_11_add[0][0]']
block_12_expand_BN (BatchN (None, 14, 14, 576) 2304 ['block_12_expand[0][0]']
ormalization)
block_12_expand_relu (ReLU (None, 14, 14, 576) 0 ['block_12_expand_BN[0][0]']
)
block_12_depthwise (Depthw (None, 14, 14, 576) 5184 ['block_12_expand_relu[0][0]']
iseConv2D)
block_12_depthwise_BN (Bat (None, 14, 14, 576) 2304 ['block_12_depthwise[0][0]']
chNormalization)
block_12_depthwise_relu (R (None, 14, 14, 576) 0 ['block_12_depthwise_BN[0][0]'
eLU) ]
block_12_project (Conv2D) (None, 14, 14, 96) 55296 ['block_12_depthwise_relu[0][0
]']
block_12_project_BN (Batch (None, 14, 14, 96) 384 ['block_12_project[0][0]']
Normalization)
block_12_add (Add) (None, 14, 14, 96) 0 ['block_11_add[0][0]',
'block_12_project_BN[0][0]']
block_13_expand (Conv2D) (None, 14, 14, 576) 55296 ['block_12_add[0][0]']
block_13_expand_BN (BatchN (None, 14, 14, 576) 2304 ['block_13_expand[0][0]']
ormalization)
block_13_expand_relu (ReLU (None, 14, 14, 576) 0 ['block_13_expand_BN[0][0]']
)
block_13_pad (ZeroPadding2 (None, 15, 15, 576) 0 ['block_13_expand_relu[0][0]']
D)
block_13_depthwise (Depthw (None, 7, 7, 576) 5184 ['block_13_pad[0][0]']
iseConv2D)
block_13_depthwise_BN (Bat (None, 7, 7, 576) 2304 ['block_13_depthwise[0][0]']
chNormalization)
block_13_depthwise_relu (R (None, 7, 7, 576) 0 ['block_13_depthwise_BN[0][0]'
eLU) ]
block_13_project (Conv2D) (None, 7, 7, 160) 92160 ['block_13_depthwise_relu[0][0
]']
block_13_project_BN (Batch (None, 7, 7, 160) 640 ['block_13_project[0][0]']
Normalization)
block_14_expand (Conv2D) (None, 7, 7, 960) 153600 ['block_13_project_BN[0][0]']
block_14_expand_BN (BatchN (None, 7, 7, 960) 3840 ['block_14_expand[0][0]']
ormalization)
block_14_expand_relu (ReLU (None, 7, 7, 960) 0 ['block_14_expand_BN[0][0]']
)
block_14_depthwise (Depthw (None, 7, 7, 960) 8640 ['block_14_expand_relu[0][0]']
iseConv2D)
block_14_depthwise_BN (Bat (None, 7, 7, 960) 3840 ['block_14_depthwise[0][0]']
chNormalization)
block_14_depthwise_relu (R (None, 7, 7, 960) 0 ['block_14_depthwise_BN[0][0]'
eLU) ]
block_14_project (Conv2D) (None, 7, 7, 160) 153600 ['block_14_depthwise_relu[0][0
]']
block_14_project_BN (Batch (None, 7, 7, 160) 640 ['block_14_project[0][0]']
Normalization)
block_14_add (Add) (None, 7, 7, 160) 0 ['block_13_project_BN[0][0]',
'block_14_project_BN[0][0]']
block_15_expand (Conv2D) (None, 7, 7, 960) 153600 ['block_14_add[0][0]']
block_15_expand_BN (BatchN (None, 7, 7, 960) 3840 ['block_15_expand[0][0]']
ormalization)
block_15_expand_relu (ReLU (None, 7, 7, 960) 0 ['block_15_expand_BN[0][0]']
)
block_15_depthwise (Depthw (None, 7, 7, 960) 8640 ['block_15_expand_relu[0][0]']
iseConv2D)
block_15_depthwise_BN (Bat (None, 7, 7, 960) 3840 ['block_15_depthwise[0][0]']
chNormalization)
block_15_depthwise_relu (R (None, 7, 7, 960) 0 ['block_15_depthwise_BN[0][0]'
eLU) ]
block_15_project (Conv2D) (None, 7, 7, 160) 153600 ['block_15_depthwise_relu[0][0
]']
block_15_project_BN (Batch (None, 7, 7, 160) 640 ['block_15_project[0][0]']
Normalization)
block_15_add (Add) (None, 7, 7, 160) 0 ['block_14_add[0][0]',
'block_15_project_BN[0][0]']
block_16_expand (Conv2D) (None, 7, 7, 960) 153600 ['block_15_add[0][0]']
block_16_expand_BN (BatchN (None, 7, 7, 960) 3840 ['block_16_expand[0][0]']
ormalization)
block_16_expand_relu (ReLU (None, 7, 7, 960) 0 ['block_16_expand_BN[0][0]']
)
block_16_depthwise (Depthw (None, 7, 7, 960) 8640 ['block_16_expand_relu[0][0]']
iseConv2D)
block_16_depthwise_BN (Bat (None, 7, 7, 960) 3840 ['block_16_depthwise[0][0]']
chNormalization)
block_16_depthwise_relu (R (None, 7, 7, 960) 0 ['block_16_depthwise_BN[0][0]'
eLU) ]
block_16_project (Conv2D) (None, 7, 7, 320) 307200 ['block_16_depthwise_relu[0][0
]']
block_16_project_BN (Batch (None, 7, 7, 320) 1280 ['block_16_project[0][0]']
Normalization)
Conv_1 (Conv2D) (None, 7, 7, 1280) 409600 ['block_16_project_BN[0][0]']
Conv_1_bn (BatchNormalizat (None, 7, 7, 1280) 5120 ['Conv_1[0][0]']
ion)
out_relu (ReLU) (None, 7, 7, 1280) 0 ['Conv_1_bn[0][0]']
global_average_pooling2d_1 (None, 1280) 0 ['out_relu[0][0]']
(GlobalAveragePooling2D)
dense_10 (Dense) (None, 128) 163968 ['global_average_pooling2d_1[0
][0]']
dropout_6 (Dropout) (None, 128) 0 ['dense_10[0][0]']
dense_11 (Dense) (None, 1) 129 ['dropout_6[0][0]']
==================================================================================================
Total params: 2422081 (9.24 MB)
Trainable params: 164097 (641.00 KB)
Non-trainable params: 2257984 (8.61 MB)
__________________________________________________________________________________________________
# Train the model on the dataset
mn_history = mn_model.fit(datagen.flow(X_train[:6000], y_train[:6000], batch_size=10), epochs=10, validation_data=(X_test[:6000], y_test[:6000]))
Epoch 1/10 600/600 [==============================] - 70s 112ms/step - loss: 0.5608 - accuracy: 0.7233 - val_loss: 0.5354 - val_accuracy: 0.7375 Epoch 2/10 600/600 [==============================] - 62s 103ms/step - loss: 0.5176 - accuracy: 0.7527 - val_loss: 0.5195 - val_accuracy: 0.7473 Epoch 3/10 600/600 [==============================] - 62s 103ms/step - loss: 0.5070 - accuracy: 0.7593 - val_loss: 0.4936 - val_accuracy: 0.7682 Epoch 4/10 600/600 [==============================] - 62s 104ms/step - loss: 0.4958 - accuracy: 0.7610 - val_loss: 0.4642 - val_accuracy: 0.7778 Epoch 5/10 600/600 [==============================] - 62s 104ms/step - loss: 0.4899 - accuracy: 0.7695 - val_loss: 0.4849 - val_accuracy: 0.7690 Epoch 6/10 600/600 [==============================] - 62s 103ms/step - loss: 0.4908 - accuracy: 0.7637 - val_loss: 0.4981 - val_accuracy: 0.7607 Epoch 7/10 600/600 [==============================] - 62s 103ms/step - loss: 0.4789 - accuracy: 0.7730 - val_loss: 0.4711 - val_accuracy: 0.7752 Epoch 8/10 600/600 [==============================] - 62s 104ms/step - loss: 0.4823 - accuracy: 0.7703 - val_loss: 0.4907 - val_accuracy: 0.7623 Epoch 9/10 600/600 [==============================] - 62s 103ms/step - loss: 0.4798 - accuracy: 0.7765 - val_loss: 0.4719 - val_accuracy: 0.7752 Epoch 10/10 600/600 [==============================] - 62s 104ms/step - loss: 0.4752 - accuracy: 0.7788 - val_loss: 0.5032 - val_accuracy: 0.7605
# Create a single figure with two subplots, arranged horizontally
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Plot training loss and validation loss in the first subplot
ax1.plot(mn_history.history['val_accuracy'], label='Validation Accuracy')
ax1.plot(mn_history.history['accuracy'], label='Training Accuracy')
ax1.set_title('Training Loss & Accuracy')
ax1.legend()
# Plot training accuracy and validation accuracy in the second subplot
ax2.plot(mn_history.history['val_loss'], label='Validation Loss')
ax2.plot(mn_history.history['loss'], label='Training Loss')
ax2.set_title('Validation Loss & Accuracy')
ax2.legend()
plt.tight_layout()
plt.show()
# Create a DataFrame with model names and accuracies
# training data points
training_data = {
'Model': ['ResNet50', 'Tuned_ResNet50', 'EfficientNetB0', 'MobileNetV2'],
'Accuracy': [72, 70, 57.5, 78]
}
df1 = pd.DataFrame(training_data)
# validation data points
validation_data = {
'Model': ['ResNet50', 'Tuned_ResNet50', 'EfficientNetB0', 'MobileNetV2'],
'Accuracy': [76.1, 62, 72, 76]
}
df2 = pd.DataFrame(validation_data)
# Create a bar plot using seaborn
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
sns.set(style="whitegrid")
# Training Accuracy Comparison
ax1 = sns.barplot(x="Model", y="Accuracy", data=df1, palette="Blues_d", ax=ax1)
ax1.set(ylabel="Accuracy (%)", title="Training Accuracy Comparison")
# Add data labels on the bars
for p1 in ax1.patches:
ax1.annotate(f'{p1.get_height()}%', (p1.get_x() + p1.get_width() / 2., p1.get_height()),
ha='center', va='center', fontsize=12, color='black', xytext=(0, 10),
textcoords='offset points')
# Validation Accuracy Comparison
ax2 = sns.barplot(x="Model", y="Accuracy", data=df2, palette="Blues_d", ax=ax2)
ax2.set(ylabel="Accuracy (%)", title="Validation Accuracy Comparison")
# Add data labels on the bars
for p2 in ax2.patches:
ax2.annotate(f'{p2.get_height()}%', (p2.get_x() + p2.get_width() / 2., p2.get_height()),
ha='center', va='center', fontsize=12, color='black', xytext=(0, 10),
textcoords='offset points')
plt.tight_layout()
plt.show()
We can see clear evidence from the comparison chart, MobileNetV2 model is giving us very good results. Will evaluate MobileNetV2 with unknown data.
Evaluating MobileNetV2 model with unknown Image to check how it is reacting with new or unknown images.
# unknown sample images for Pneumonia detection
image_with_pneumonia = '/tmp/stage_2_train_images/0004cfab-14fd-4e49-80ba-63a80b6bddd6.dcm'
image_without_pneumonia = '/tmp/stage_2_train_images/00436515-870c-4b36-a041-de91049b9ab4.dcm'
print("With Pneumonia:", image_with_pneumonia,"\nWithout Pneumonia: ",image_without_pneumonia)
With Pneumonia: /tmp/stage_2_train_images/0004cfab-14fd-4e49-80ba-63a80b6bddd6.dcm Without Pneumonia: /tmp/stage_2_train_images/00436515-870c-4b36-a041-de91049b9ab4.dcm
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image
# Load and preprocess the unknown image
# Pneumonia is not present
# image_filepath = image_with_pneumonia
# Pneumonia is present
image_filepath = image_without_pneumonia
# Load and preprocess the DICOM image
dcm_fdata = dcm.read_file(image_filepath)
# You may need to extract the pixel data and convert it to an array
image_data = dcm_fdata.pixel_array
# Resize image to 224x224
img = cv2.resize(image_data, (224, 224))
img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
chest_xray = img
img = np.expand_dims(img, axis=0)
img = preprocess_input(img)
# Make predictions
predictions = mn_model.predict(img)
predictions, img.shape
1/1 [==============================] - 1s 845ms/step
(array([[0.8877876]], dtype=float32), (1, 224, 224, 3))
# predicted probability
predicted_probability = predictions[0][0]
# Define a threshold (you can adjust this value)
threshold = 0.5
# Interpret the prediction
if predicted_probability > threshold:
prediction_label = "Pneumonia Present"
else:
prediction_label = "No Pneumonia"
# Print the result
print(f"Predicted class: {prediction_label}")
print(f"Confidence: {predicted_probability}")
sns.set(style="white")
plt.title(prediction_label)
plt.imshow(chest_xray)
Predicted class: Pneumonia Present Confidence: 0.8877875804901123
<matplotlib.image.AxesImage at 0x7dcdbc0d7f10>
Model Accuracy is looks good on the sample data.
Model Accuracy of MobileNetV2 is ~77%
Model Validation Accuracy of MobileNetV2 is ~77%
MobileNetV2 model look more appealing to me. It is giving more convincing results. Which accuracy is more close to the existing solution of pneumonia detection.
I have used couple of sample unkown images with Pneumonia or without Pneumonia to check if our model ig giving us correct detection result or not. I am getting good results here too.
See results of the two above cells, we can just comment and uncomment the lines to check the detection results.
The first step in RCNN is Selective search. Lets initialize Selective search using createSelectiveSearchSegmentation() class of opencv library.
cv2.setUseOptimized(True);
ss_object = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
Set one image as the base for selective search using setBaseImage(image)
Selective search segmentation function uses hierarchical clustering to group pixels and then combine them into one based on color, texture or composition. The following is an implementation of the search code on the base image.
input_image = resized_images[4]
ss_object.setBaseImage(input_image)
ss_object.switchToSelectiveSearchFast() #this method of createSelectiveSearchSegmentation()
rects = ss_object.process() # The output of the process is a set of a potential ROI’s, depending on the size of the base image
new_input_image = input_image.copy() # create copy of the base image
for i, rect in (enumerate(rects)):
x, y, w, h = rect
cv2.rectangle(new_input_image, (x, y), (x+w, y+h), (0, 255, 0), 1, cv2.LINE_AA)
# plt.figure()
plt.imshow(new_input_image)
<matplotlib.image.AxesImage at 0x7f5e73c820b0>
Loop over the image folder and set each image one by one as the base for selective search using setBaseImage(image) and get the proposed regions
Initialising fast selective search and getting proposed regions using class switchToSelectiveSearchFast() and process().
Iterating over all the first 2000 results passed by selective search and calculating IOU of the proposed region and annotated region using the get_iou() function created above.
Now as one image can have many negative sample (i.e. background) and just some positive sample (i.e. airplane) so we need to make sure that we have good proportion of both positive and negative sample to train our model. Therefore we have set that we will collect maximum of 30 negative sample (i.e. background) and positive sample (i.e. airplane) from one image.
train_data=[]
train_labels_data=[]
def calculate_iou(bb_1, bb_2):
'''
Now we are initialising the function to calculate IOU (Intersection Over Union)
of the ground truth box from the box computed by selective search.
To divide the generated ROI’s, for example, we can use a metric called IoU.
It’s defined as the intersection area divided by area of the union of a predicted
bounding box and ground-truth box.
'''
#assert bb_1['x1'] < bb_1['x2'] # The assert keyword lets you test if a condition in your code returns True,
#assert bb_1['y1'] < bb_1['y2'] # if not, the program will raise an AssertionError.
#assert bb_2['x1'] < bb_2['x2']
#assert bb_2['y1'] < bb_2['y2']
x_left = max(bb_1['x1'], bb_2['x1'])
y_top = max(bb_1['y1'], bb_2['y1'])
x_right = min(bb_1['x2'], bb_2['x2'])
y_bottom = min(bb_1['y2'], bb_2['y2'])
if x_right < x_left or y_bottom < y_top:
return 0.0
intersection = (x_right - x_left) * (y_bottom - y_top)
bb_1_area = (bb_1['x2'] - bb_1['x1']) * (bb_1['y2'] - bb_1['y1'])
bb_2_area = (bb_2['x2'] - bb_2['x1']) * (bb_2['y2'] - bb_2['y1'])
iou_value = intersection / float(bb_1_area + bb_2_area - intersection)
#assert iou_value >= 0.0
#assert iou_value <= 1.0
return iou_value
MAX_REGION_PROPOSALS = 2000
for i, e in enumerate(resized_images[:500]):
image = resized_images[i]
#print(i)
coordinates=[]
x1 = pneumonia_df['x'][i]*(224.0/1024.0)
y1 = pneumonia_df['y'][i]*(224.0/1024.0)
x2 = pneumonia_df['width'][i]*(224.0/1024.0)
y2 = pneumonia_df['height'][i]*(224.0/1024.0)
coordinates.append({"x1":x1,"x2":x2,"y1":y1,"y2":y2})
ss_object.setBaseImage(image)
ss_object.switchToSelectiveSearchFast()
ss_results = ss_object.process()
image_new = image.copy()
min_positive_samples = 0
min_negative_samples = 0
flag = 0
foreground_flag = 0
background_flag = 0
for region,ss_coordinate in enumerate(ss_results):
#print(region)
if region < MAX_REGION_PROPOSALS and flag == 0: # Iterating over all the first 2000 results only which are passed by selective search
for value in coordinates:
x,y,w,h = ss_coordinate
iou = calculate_iou(value,{"x1":x,"x2":x+w,"y1":y,"y2":y+h})
if min_positive_samples < 30:
if iou > 0.70:
mobile_obj_img = image_new[y:y+h,x:x+w]
resized_image_2 = cv2.resize(mobile_obj_img, (224,224), interpolation = cv2.INTER_AREA)
train_data.append(resized_image_2)
train_labels_data.append(1)
min_positive_samples += 1
else :
foreground_flag = 1
if min_negative_samples < 30:
if iou < 0.3:
mobile_obj_img = image_new[y:y+h,x:x+w]
resized_image_2 = cv2.resize(mobile_obj_img, (224,224), interpolation = cv2.INTER_AREA)
#print(resized_image_2.shape)
train_data.append(resized_image_2)
#print(train_data)
train_labels_data.append(0)
min_negative_samples += 1
else :
background_flag = 1
if foreground_flag == 1 and background_flag == 1:
print("inside")
flag = 1
Now we will do transfer learning on the imagenet weight. We will import VGG16 model and also put the imagenet weight in the model.
# Trying VGG16
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
# Using VGG16 as a base model
vgg_model = VGG16(weights='imagenet', include_top=True)
vgg_model.summary()
Model: "vgg16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 224, 224, 3)] 0
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
block1_conv2 (Conv2D) (None, 224, 224, 64) 36928
block1_pool (MaxPooling2D) (None, 112, 112, 64) 0
block2_conv1 (Conv2D) (None, 112, 112, 128) 73856
block2_conv2 (Conv2D) (None, 112, 112, 128) 147584
block2_pool (MaxPooling2D) (None, 56, 56, 128) 0
block3_conv1 (Conv2D) (None, 56, 56, 256) 295168
block3_conv2 (Conv2D) (None, 56, 56, 256) 590080
block3_conv3 (Conv2D) (None, 56, 56, 256) 590080
block3_pool (MaxPooling2D) (None, 28, 28, 256) 0
block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160
block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808
block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808
block4_pool (MaxPooling2D) (None, 14, 14, 512) 0
block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808
block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808
block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808
block5_pool (MaxPooling2D) (None, 7, 7, 512) 0
flatten (Flatten) (None, 25088) 0
fc1 (Dense) (None, 4096) 102764544
fc2 (Dense) (None, 4096) 16781312
predictions (Dense) (None, 1000) 4097000
=================================================================
Total params: 138357544 (527.79 MB)
Trainable params: 138357544 (527.79 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
In this part in the loop we are freezing the first 15 layers of the model. After that we are taking out the second last layer of the model and then adding a 2 unit softmax dense layer as we have just 2 classes to predict i.e. foreground or background. After that we are compiling the model using Adam optimizer with learning rate of 0.001. We are using categorical_crossentropy as loss since the output of the model is categorical. Finally the summary of the model is printed using summary() method of keras.
for layers in (vgg_model.layers)[:15]:
print(layers)
layers.trainable = False
<keras.src.engine.input_layer.InputLayer object at 0x7f5e70cab8b0> <keras.src.layers.convolutional.conv2d.Conv2D object at 0x7f5e70ca9a50> <keras.src.layers.convolutional.conv2d.Conv2D object at 0x7f5e70ca8550> <keras.src.layers.pooling.max_pooling2d.MaxPooling2D object at 0x7f5e70caaa10> <keras.src.layers.convolutional.conv2d.Conv2D object at 0x7f5e70ca8520> <keras.src.layers.convolutional.conv2d.Conv2D object at 0x7f5e70caab00> <keras.src.layers.pooling.max_pooling2d.MaxPooling2D object at 0x7f5e6c0d8430> <keras.src.layers.convolutional.conv2d.Conv2D object at 0x7f5e70c87c70> <keras.src.layers.convolutional.conv2d.Conv2D object at 0x7f5e70c875e0> <keras.src.layers.convolutional.conv2d.Conv2D object at 0x7f5e70ca83d0> <keras.src.layers.pooling.max_pooling2d.MaxPooling2D object at 0x7f5e6c0da710> <keras.src.layers.convolutional.conv2d.Conv2D object at 0x7f5e6c0daf20> <keras.src.layers.convolutional.conv2d.Conv2D object at 0x7f5e6c0db400> <keras.src.layers.convolutional.conv2d.Conv2D object at 0x7f5e6c0dbc40> <keras.src.layers.pooling.max_pooling2d.MaxPooling2D object at 0x7f5e6c0e85b0>
x = vgg_model.layers[-2].output
x = Dense(2, activation="softmax")(x)
model = Model(inputs = vgg_model.input, outputs = x)
model.summary()
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 224, 224, 3)] 0
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
block1_conv2 (Conv2D) (None, 224, 224, 64) 36928
block1_pool (MaxPooling2D) (None, 112, 112, 64) 0
block2_conv1 (Conv2D) (None, 112, 112, 128) 73856
block2_conv2 (Conv2D) (None, 112, 112, 128) 147584
block2_pool (MaxPooling2D) (None, 56, 56, 128) 0
block3_conv1 (Conv2D) (None, 56, 56, 256) 295168
block3_conv2 (Conv2D) (None, 56, 56, 256) 590080
block3_conv3 (Conv2D) (None, 56, 56, 256) 590080
block3_pool (MaxPooling2D) (None, 28, 28, 256) 0
block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160
block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808
block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808
block4_pool (MaxPooling2D) (None, 14, 14, 512) 0
block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808
block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808
block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808
block5_pool (MaxPooling2D) (None, 7, 7, 512) 0
flatten (Flatten) (None, 25088) 0
fc1 (Dense) (None, 4096) 102764544
fc2 (Dense) (None, 4096) 16781312
dense (Dense) (None, 2) 8194
=================================================================
Total params: 134268738 (512.19 MB)
Trainable params: 126633474 (483.07 MB)
Non-trainable params: 7635264 (29.13 MB)
_________________________________________________________________
opt = Adam(learning_rate=0.0001)
model.compile(loss = "categorical_crossentropy", optimizer = opt, metrics=["accuracy"])
After creating the model now we need to split the dataset into train and test set. Before that we need to one-hot encode the label. For that we are using MyLabelBinarizer() and encoding the dataset. Then we are splitting the dataset using train_test_split from sklearn. We are keeping 10% of the dataset as test set and 90% as training set.
train_data[0].shape
(224, 224, 3)
len(train_labels_data)
6752
train_data=[] will contain all the images and train_labels_data=[] will contain all the labels marking airplane images as 1 and non airplane images (i.e. background images) as 0.
# define independent and target features
X = np.array(train_data)
y = np.array(train_labels_data)
X.shape
(6752, 224, 224, 3)
y.shape
(6752,)
from sklearn.preprocessing import LabelBinarizer
class My_Label_Binarizer(LabelBinarizer):
def transform(self, y):
Y = super().transform(y)
if self.y_type_ == 'binary':
return np.hstack((Y, 1-Y))
else:
return Y
def inverse_transform(self, Y, threshold=None):
if self.y_type_ == 'binary':
return super().inverse_transform(Y[:, 0], threshold)
else:
return super().inverse_transform(Y, threshold)
lb_object = My_Label_Binarizer()
Y = lb_object.fit_transform(y)
Y[25]
array([0, 1])
X_train, X_test , y_train, y_test = train_test_split(X , Y, test_size = 0.10)
print(X_train.shape,X_test.shape,y_train.shape,y_test.shape)
(6076, 224, 224, 3) (676, 224, 224, 3) (6076, 2) (676, 2)
Now we start the training of the model using fit() method.
rcnn_history = model.fit(X_train, y_train, steps_per_epoch=5, epochs=2, validation_split = 0.2)
Epoch 1/2 5/5 [==============================] - 21s 1s/step - loss: 0.1472 - accuracy: 0.9383 - val_loss: 0.0000e+00 - val_accuracy: 1.0000 Epoch 2/2 5/5 [==============================] - 3s 699ms/step - loss: 0.0393 - accuracy: 0.9996 - val_loss: 0.0000e+00 - val_accuracy: 1.0000
# Create a single figure with two subplots, arranged horizontally
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Plot training loss and validation loss in the first subplot
ax1.plot(rcnn_history.history['loss'], label='Training Loss')
ax1.plot(rcnn_history.history['accuracy'], label='Training Accuracy')
ax1.set_title('Training Loss & Accuracy')
ax1.legend()
# Plot training accuracy and validation accuracy in the second subplot
ax2.plot(rcnn_history.history['val_loss'], label='Validation Loss')
ax2.plot(rcnn_history.history['val_accuracy'], label='Validation Accuracy')
ax2.set_title('Validation Loss & Accuracy')
ax2.legend()
plt.tight_layout()
plt.show()
X_test[0]
array([[[11, 11, 11],
[ 5, 5, 5],
[ 4, 4, 4],
...,
[ 0, 0, 0],
[ 0, 0, 0],
[ 0, 0, 0]],
[[10, 10, 10],
[ 4, 4, 4],
[ 4, 4, 4],
...,
[23, 23, 23],
[23, 23, 23],
[18, 18, 18]],
[[10, 10, 10],
[ 4, 4, 4],
[ 4, 4, 4],
...,
[17, 17, 17],
[15, 15, 15],
[15, 15, 15]],
...,
[[17, 17, 17],
[19, 19, 19],
[27, 27, 27],
...,
[52, 52, 52],
[57, 57, 57],
[57, 57, 57]],
[[15, 15, 15],
[22, 22, 22],
[25, 25, 25],
...,
[52, 52, 52],
[61, 61, 61],
[74, 74, 74]],
[[16, 16, 16],
[24, 24, 24],
[23, 23, 23],
...,
[72, 72, 72],
[92, 92, 92],
[89, 89, 89]]], dtype=uint8)
model.predict(X_test[0:10])
1/1 [==============================] - 1s 594ms/step
array([[0., 1.],
[0., 1.],
[0., 1.],
[0., 1.],
[0., 1.],
[0., 1.],
[0., 1.],
[0., 1.],
[0., 1.],
[0., 1.]], dtype=float32)
y_test[0:10]
array([[0, 1],
[0, 1],
[0, 1],
[0, 1],
[0, 1],
[0, 1],
[0, 1],
[0, 1],
[0, 1],
[0, 1]])
The above detail shows that the RCNN model can classify for the given image samples whether it has pneumonia or not 100%
import pickle
# Save the ResNet50 model to a file
model_filename = '/content/drive/MyDrive/Capstone/Pneumonia/pickle/resnet50_model.pkl'
with open(model_filename, 'wb') as model_file:
pickle.dump(model_resnet, model_file)
# Save the ResNet50 model to a file
tuned_model_filename = '/content/drive/MyDrive/Capstone/Pneumonia/pickle/tuned_resnet50_model.pkl'
with open(tuned_model_filename, 'wb') as model_file:
pickle.dump(tuned_model_resnet, model_file)
# Save the efficient model to a file
efficientnet_model_filename = '/content/drive/MyDrive/Capstone/Pneumonia/pickle/efficientnet_model.pkl'
with open(efficientnet_model_filename, 'wb') as model_file:
pickle.dump(tl_model, model_file)
import pickle
# Save the efficient model to a file
rcnn_model_filename = '/content/drive/MyDrive/rcnn_model.pkl'
with open(rcnn_model_filename, 'wb') as model_file:
pickle.dump(model, model_file)
Final report will be attached in the form of pdf.